M201 12 and 13 THE OPEN UNIVERSITY
Mathematics: A Second Level Course
Linear Mathematics Unit 12 and 13


Linear Functionals and Duality 
Systems of Differential Equations 






Unit 12 LINEAR FUNCTIONALS AND DUALITY 
Unit 13 SYSTEMS OF DIFFERENTIAL EQUATIONS 


Prepared by the Linear Mathematics Course Team 


The Open University Press 


Unit 12 Linear Functionals and Duality


The Open University Press, Walton Hall, Milton Keynes, MK7 6AA.


First published 1972. Reprinted 1976 
Copyright © 1972 The Open University 


All rights reserved. No part of this work may 
be reproduced in any form, by mimeograph 
or any other means, without permission in 
writing from the publishers. 


Designed by the Media Development Group of the Open University. 


Printed in Great Britain by 
Martin Cadbury 


SBN 335 01103 9 


This text forms part of the correspondence element of an Open University 
Second Level Course. The complete list of units in the course is given at 
the end of this text. 


For general availability of supporting material referred to in this text, 
please write to the Director of Marketing, The Open University, P.O. Box 
81, Walton Hall, Milton Keynes, MK7 6AT. 


Further information on Open University courses may be obtained from 
the Admissions Office, The Open University, P.O. Box 48, Walton Hall, 
Milton Keynes, MK7 6AB.


Contents

Set Books
Conventions
12.0 Introduction

12.1 Linear Functionals
12.1.1 The Definition of a Linear Functional
12.1.2 The Dual Space of V
12.1.3 The Dual Basis
12.1.4 Using the Dual Basis: an Example (Optional)
12.1.5 Summary of Section 12.1

12.2 Duality
12.2.1 Exchanging Function and Domain
12.2.2 Duality and Linear Functionals
12.2.3 Is 7 the Whole of P?
12.2.4 Summary of Section 12.2

12.3 Duality in Action
12.3.0 Introduction
12.3.1 Factor Analysis (Optional)
12.3.2 The Delta Function (Optional)
12.3.3 Annihilators
12.3.4 Summary of Section 12.3

12.4 Summary of the Unit

12.5 Self-Assessment


Set Books 

D. L. Kreider, R. G. Kuller, D. R. Ostberg and F. W. Perkins, An Introduction
to Linear Analysis (Addison-Wesley, 1966).

E. D. Nering, Linear Algebra and Matrix Theory (John Wiley, 1970).

It is essential to have these books; the course is based on them and will
not make sense without them.


Conventions 


Before working through this correspondence text make sure you have 
read A Guide to the Linear Mathematics Course. Of the typographical 
conventions given in the Guide the following are the most important. 


The set books are referred to as: 


K for An Introduction to Linear Analysis 
N for Linear Algebra and Matrix Theory 


All starred items in the summaries are examinable. 


References to the Open University Mathematics Foundation Course Units 
(The Open University Press, 1971) take the form Unit M100 3, Operations 
and Morphisms. 


LM 12 


12.0 INTRODUCTION 


This unit introduces you to two important mathematical concepts, linear
functionals and duality. The first of these concepts is quite easy to understand:
in fact, as you will see, a linear functional is just a special case of a
linear transformation. Duality, however, is less concrete. Instead of being a
rigidly defined mathematical object, it is a series of related properties
possessed by various kinds of mathematical object. It is an idea that mathe- 
maticians have been aware of for a very long time, but it has been given a 
rigorous mathematical expression only as recently as 1945. Duality has 
played a very important part in the recent development of higher mathe- 
matics, and if you continue with mathematics to a higher level, you may 
study areas in pure mathematics which have been developed with its help. 
But in this unit, the main aim is to introduce you to the idea of duality in the 
specific context of linear functionals. 


The portion of Nering covered in this unit is quite small—Sections 1 and 2, 
the last paragraph of Section 3, and part of Section 4, of N Chapter IV. 
It is written in a very condensed style, even for Nering, and we would 
suggest that, in this case, you should work through the appropriate
section of N as part of the summary of each section of this text. 


By a linear functional we mean simply a linear transformation whose
codomain is whatever field of scalars we are working with (i.e. normally
the field R). There are many examples of such transformations. One is the
transformation from $R^2$ to R which maps each ordered pair (x, y) in the
domain $R^2$ to its first member x; another is the transformation from C[0, 1]
(the set of all continuous real-valued functions with domain [0, 1]) to R
which maps each function f in C[0, 1] to its definite integral $\int_0^1 f$. Linear
functionals also arise very naturally whenever we want to solve a system of
algebraic equations; for example, in the system


2x+y=5 
2x- 3y=1 


the first equation can be written $\phi(x, y) = 5$, where $\phi$
is the linear functional that maps (x, y) to 2x + y. They also arise naturally
in linear programming* problems. Suppose, for example, that you wanted 
to devise a diet, consisting, say, of a apples per week, b bananas, c cab- 
bages and so on, which would provide the necessary amounts of protein, 
fat, carbohydrate and the various vitamins, at the minimum cost. The cost 
of a week’s supply of food is then 


$Aa + Bb + Cc + \cdots$


where A is the price of an apple, B that of a banana and so on, and this can
be written $\phi(a, b, c, \ldots)$ where $\phi$ is the linear functional


$(a, b, c, \ldots) \longmapsto Aa + Bb + Cc + \cdots$


Mathematically, the problem of devising this diet is the same as that of 
finding the minimum value of the linear functional $\phi$ subject to certain
conditions on (a, b, c...) which represent the conditions that the diet con- 
tains enough protein, enough fat, etc., and that all the quantities of food 
must be greater than or equal to 0. The linear functional itself gives a 
mathematical representation of the price structure within which the diet is 
being worked out. For our problem it contains exactly the same information 
as a price list giving the numbers [A, B, C, ...], which you might see in a
greengrocer’s shop. (The reason why we use square brackets here will be 
explained later in the unit.) 


* Linear programming is the subject of Unit 18 of this course. 




The idea of duality arises in roughly the following way. We have just seen 
that the price list [A, B, C, ...] corresponds to a linear functional $\phi$,
whose domain is the set of all possible shopping lists (a, b, c, ...); this 
functional represents the way the price list affects the customer, who wants 
to know how her total food bill depends on what she decides to buy. But 
from the greengrocer’s point of view, things look rather different. He wants 
to know whether he can increase his profit by altering the price list, and so 
one of the things he is interested in is how the total cost of a given 
shopping list depends on his prices. In other words, he thinks of the cust- 
omer’s shopping list as determining a function whose domain is the set of 
all the price lists he might use: 


$\psi : [A, B, C, \ldots] \longmapsto Aa + Bb + Cc + \cdots$
This is another linear functional. 


Mathematically, these two viewpoints give rise to two vector spaces: the 
customer’s space, in which the shopping list is thought of as a vector and 
the price list as a linear functional, and the greengrocer’s space in which the
price list is thought of as a vector, and the shopping list as a linear functional.
These two spaces are said to be dual vector spaces. This dual point of view 
turns out to be very useful in linear programming problems, such as the 
diet problem mentioned above. (Actually, shopping lists such as these do 
not strictly speaking form vector spaces over R, because, for example, if $\alpha$
is in the space then $-\alpha$ is not. Nevertheless, they serve to make the
mathematical point we are interested in.)
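
The two viewpoints described above can be sketched in a few lines of Python. This is only an illustration: the prices (in pence) and quantities are invented, and the names `pair`, `phi` and `psi` are our own labels for the total cost, the customer's functional and the greengrocer's functional.

```python
# Hypothetical data, chosen only for illustration.
prices = [30, 25, 120]   # price list [A, B, C], in pence
shopping = [6, 4, 1]     # shopping list (a, b, c)

def pair(price_list, shopping_list):
    """Total cost Aa + Bb + Cc + ... of a shopping list at given prices."""
    return sum(p * q for p, q in zip(price_list, shopping_list))

def phi(shopping_list):
    """Customer's view: the price list is fixed and acts on shopping lists."""
    return pair(prices, shopping_list)

def psi(price_list):
    """Greengrocer's view: the shopping list is fixed and acts on price lists."""
    return pair(price_list, shopping)

total_customer = phi(shopping)
total_grocer = psi(prices)
```

Both functions compute the same number from the same two lists; the only difference is which list is regarded as the variable.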


12.1 LINEAR FUNCTIONALS 
12.1.1 The Definition of a Linear Functional 


The object of this sub-section is to make precise the mathematical de- 
finition of a linear functional. As we said in the Introduction to the unit, a 
linear functional is simply a linear transformation whose codomain is the 
field of scalars with which we are working. 




But this implies that we are considering the field of scalars to be a vector 
space over itself, since the codomain of a linear transformation is a vector 
space. Can we do this? 


Let us be specific about this. We know that $R^2$, $R^3$, and so on, are vector
spaces, with bases {(1, 0), (0, 1)} (for $R^2$), {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
(for $R^3$), and so on. It would be very odd indeed if we could not tack R itself
(i.e. $R^1$) on to the beginning of this series of vector spaces, and use the
single element 1 of R as a basis. We can indeed do this*; all the axioms of
addition and scalar multiplication work if we regard the set of “vectors” 
in the vector space as just a copy of the set of “scalars”. The operation of 
“scalar multiplication” of a “vector” a by a “scalar” b simply results in 
the element ba, which is perfectly well defined in terms of the multiplication 
operation in R. Conceptually, this multiplication should not be thought of 
as the multiplication of two vectors in the “vector space” copy of R, but
rather as the result of multiplying a “vector” in R by a “scalar” in R.


For example, if we agree to embolden the elements of R when they are 
acting as vectors, then 


$\mathbf{2} + \mathbf{3} = \mathbf{5}$
$2\,\mathbf{3} = \mathbf{6}$ (scalar multiplication of a vector),
$2 \times 3 = 6$ (multiplication within the field of scalars).


In practice, it is not necessary to keep the two roles of R separate; this 
does not, after all, affect the results of any calculations. In general, for any 
field F, the above argument applies word for word, and we can define a 
linear functional as follows. 

Definition 


If V is a vector space over a field F, then a linear functional on V is a function
$\phi$ with domain V and codomain F, such that $\phi$ is a linear transformation,
i.e.


$\phi(a\alpha + b\beta) = a\phi(\alpha) + b\phi(\beta) \quad (a, b \in F;\ \alpha, \beta \in V).$


Examples 
1. The function from $R^3$ to R
$\phi : (x_1, x_2, x_3) \longmapsto x_1 \quad ((x_1, x_2, x_3) \in R^3)$


is a linear functional because its domain is a vector space over the field R, 
its codomain is R, and it is linear. 


* See Example 1 on page K7, which we discussed in Unit 1.


Proof of linearity: if a, b are any scalars, and $\xi = (x_1, x_2, x_3)$ and
$\eta = (y_1, y_2, y_3)$ are any vectors in $R^3$, we have


$\phi(\xi) = x_1$
$\phi(\eta) = y_1$
$\phi(a\xi + b\eta) = \phi(ax_1 + by_1,\ ax_2 + by_2,\ ax_3 + by_3)$
$= ax_1 + by_1$
$= a\phi(\xi) + b\phi(\eta).$


2. The function
$\psi : (x_1, x_2, x_3) \longmapsto x_1 + 2x_2 + 3x_3 + c \quad ((x_1, x_2, x_3) \in R^3)$


where c is a real number, is a linear functional if and only if c = 0, since
the image of the zero vector must be zero in order for $\psi$ to be linear.
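
A quick numerical spot check of the two examples may help. The sketch below is ours, not part of the set books: the function names and the test vectors are arbitrary, and a check at particular vectors can only refute linearity, never prove it.

```python
def first_coordinate(x):
    """Example 1: (x1, x2, x3) -> x1."""
    return x[0]

def make_psi(c):
    """Example 2: (x1, x2, x3) -> x1 + 2*x2 + 3*x3 + c, for a chosen constant c."""
    def psi(x):
        return x[0] + 2 * x[1] + 3 * x[2] + c
    return psi

def respects_combination(f, a, b, xi, eta):
    """Test f(a*xi + b*eta) == a*f(xi) + b*f(eta) for one choice of a, b, xi, eta."""
    combo = tuple(a * s + b * t for s, t in zip(xi, eta))
    return f(combo) == a * f(xi) + b * f(eta)

xi, eta = (1, 2, 3), (4, -1, 0)   # arbitrary test vectors
ok_phi = respects_combination(first_coordinate, 2, -3, xi, eta)
ok_psi_zero = respects_combination(make_psi(0), 2, -3, xi, eta)
ok_psi_one = respects_combination(make_psi(1), 2, -3, xi, eta)
```

With c = 1 the check fails, as expected, because the constant is counted once on the left-hand side but a + b times on the right.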


To summarize, a mapping whose domain is a real vector space has to 
satisfy two requirements to qualify as a linear functional: 


(i) it must be a linear transformation; 
(ii) its codomain must be R. 


Exercises 


1. Do Exercise 1 on page N131, ignoring the two sentences containing the
symbol A. 


2. Which of the following mappings are linear functionals on $P_3$, the
vector space consisting of all real polynomial functions of degree 2 or
less?


(a) $f \longmapsto f'$, the derived function

o s— fis 

(c) $f \longmapsto F$, where F is the primitive function of f defined by
$F : x \longmapsto \int_0^x f \quad (x \in R)$


(d) f-—+ fQ) 
© fr—f@ 


(f) f-— P 
o 


Solutions 


1. (a), (c) and (d) are linear functionals.
(b) is not, since, for example,


$\phi(2x_1, 0, 0) = 4x_1^2 \neq 2\phi(x_1, 0, 0).$
(e) is not, since, for example,
$\phi(0, 0, 0) \neq 0.$


2. Only (b), (d) and (e) are linear functionals. The mappings (a)
and (c) are not, because they map $P_3$ to $P_2$ and $P_4$ respectively,
not to R. The mapping (f) maps $P_3$ to R, but it is not linear.


12.1.2 The Dual Space of V 


In this sub-section we obtain the mathematical concept corresponding to the
“greengrocer’s space” discussed in the Introduction to this unit. If the
original space (“customer’s space”) with typical element say (a, b, c, ...) is
a vector space V then each “price list”, typically [A, B, C, ...], gives rise to a
linear functional on V


$(a, b, c, \ldots) \longmapsto Aa + Bb + Cc + \cdots$


The set of all such linear functionals is called the dual space of V and
denoted by $\hat{V}$ (usually pronounced “V hat” or “V cap”). The main results of
this sub-section of the unit are that $\hat{V}$ is itself a vector space, and that if V
has finite dimension n, then so does $\hat{V}$.


To make it a little easier to visualize the elements of $\hat{V}$ before embarking on
the proofs of these statements, we consider the matrix representation of a
linear functional $\phi$. The general rule for matrix representations (see
page N38) is that each column of the matrix contains the coordinates (with
respect to the codomain basis) of the image of one of the domain basis
vectors. If the domain has dimension n and the codomain dimension m, then
the matrix has m rows and n columns. Thus, in our case, if the domain V is
n-dimensional, there are n columns in the matrix; the codomain is
one-dimensional, and so the matrix has only one row. Thus any linear functional
over an n-dimensional vector space V can be represented by a one-row
matrix of the form


$B = [b_1 \;\; b_2 \;\; \cdots \;\; b_n]. \qquad (1)$


This is the reason why we used square brackets to describe the price lists
in our shopping example. You will recall from Unit 2, Linear Transformations,
that $(x_1, \ldots, x_n)$ with round brackets is a space-saving way of writing


$X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$


This is the representation for the elements of V itself. The images under the
linear functional represented by B are then of the form


$BX = b_1x_1 + b_2x_2 + \cdots + b_nx_n.$

We studied this representation in Unit 2 (see page N41).

It is fairly obvious from the one-row matrix representation (1) for linear
functionals over V, that these functionals form a vector space and that its
dimension is n. We now set about proving these statements.
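
Before the proofs, the one-row representation can be tried out concretely. The following is a minimal sketch, assuming functionals on $R^3$ stored as Python lists of three numbers; the helper names `apply_row`, `add_rows` and `scale_row` are ours and purely illustrative.

```python
def apply_row(B, X):
    """Image BX = b1*x1 + ... + bn*xn of the column X under the row B."""
    return sum(b * x for b, x in zip(B, X))

def add_rows(B, C):
    """Row matrix of the sum of the two functionals (entrywise addition)."""
    return [b + c for b, c in zip(B, C)]

def scale_row(a, B):
    """Row matrix of the scalar multiple of a functional."""
    return [a * b for b in B]

B = [1, 2, 3]      # arbitrary example functionals
C = [0, -1, 4]
X = (2, 5, -1)     # arbitrary example vector

sum_applied = apply_row(add_rows(B, C), X)
scaled_applied = apply_row(scale_row(3, B), X)
```

Adding or scaling the rows gives the same images as adding or scaling the values of the functionals, which is why one-row matrices of length n behave like the vectors of an n-dimensional space.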


Theorem 1 


The set of all linear functionals on V forms a vector space. (You may re- 
member that we have already proved this result in Unit 2, where we saw 
that the set of all linear transformations from one vector space to another 
forms a vector space. See page N30. But it is important enough to go over 
again.) 


Proof 


First of all, what do we mean by asserting that the linear functionals on V 
form a vector space? What operations would count as vector addition and 
scalar multiplication ? 


The answer is that linear functionals are functions, and we add them by
means of the corresponding operations on the codomain. That is to say,
if $\phi$ and $\psi$ are linear functionals, then we define $\phi + \psi$ by:


$(\phi + \psi)(\alpha) = \phi(\alpha) + \psi(\alpha) \quad (\alpha \in V)$


and similarly, we define $a\phi$ by:


$(a\phi)(\alpha) = a(\phi(\alpha)) \quad (\alpha \in V,\ a \in F).$
The next stage in the proof is to check that $\phi + \psi$ and $a\phi$ are actually
linear functionals. To this end, we let b, c be any scalars, and $\beta$, $\gamma$ be any
vectors in V, and check that:


$(\phi + \psi)(b\beta + c\gamma) = b(\phi + \psi)(\beta) + c(\phi + \psi)(\gamma); \qquad (2)$


$(a\phi)(b\beta + c\gamma) = b(a\phi)(\beta) + c(a\phi)(\gamma). \qquad (3)$

For Equation (2)


$(\phi + \psi)(b\beta + c\gamma) = \phi(b\beta + c\gamma) + \psi(b\beta + c\gamma)$
$= b\phi(\beta) + c\phi(\gamma) + b\psi(\beta) + c\psi(\gamma)$
$= b(\phi(\beta) + \psi(\beta)) + c(\phi(\gamma) + \psi(\gamma))$
$= b(\phi + \psi)(\beta) + c(\phi + \psi)(\gamma).$

For Equation (3)


$(a\phi)(b\beta + c\gamma) = a(\phi(b\beta + c\gamma))$
$= a(b\phi(\beta) + c\phi(\gamma))$
$= ab\phi(\beta) + ac\phi(\gamma)$
$= b(a\phi)(\beta) + c(a\phi)(\gamma).$

It is a routine matter to check that the laws of associativity, distributivity,
etc., hold for the above operations. If you feel that you need practice at
some of them, then by all means work your way down the list of axioms on
pages N7-8. Otherwise, don’t bother.


Let us now set this new space in its context. First of all, we need a name 
and symbol for it. 


Definition 


The vector space of all linear functionals on V is called the dual space of V,
and is denoted by $\hat{V}$.


We constructed $\hat{V}$ as a function space; that is we defined addition and scalar
multiplication in terms of the corresponding operations in the codomain
of the functions which are the elements of $\hat{V}$. This is not a new idea: besides
spaces of polynomials, continuous functions, continuously differentiable
functions, etc., we have seen linear transformations from V to U added and
multiplied by scalars:


$(\sigma + \tau)(\alpha) = \sigma(\alpha) + \tau(\alpha)$
$(a\sigma)(\alpha) = a(\sigma(\alpha)).$


Here too, we could conveniently find the sum $\sigma + \tau$ by finding the sum of
the matrices representing them. In fact, as we saw in Unit 2, the set of all
linear transformations from V to U is itself a vector space, and $\hat{V}$ is merely
a particular case of this, when U is just the field of scalars.


Before discussing Theorem 1 we saw that it is convenient to represent
elements of $\hat{V}$ by one-row matrices; the number of rows in the matrix of
a transformation is equal to the dimension of the codomain, which is in this
case 1. We can use this matrix representation to find the dimension of $\hat{V}$.
After all, to find the dimension of $\hat{V}$, we look for a basis of $\hat{V}$, and it is very
strongly to be presumed that the required basis consists of elements having
matrices


$[1 \;\; 0 \;\; \cdots \;\; 0],$
$[0 \;\; 1 \;\; \cdots \;\; 0],$
$\ldots,$
$[0 \;\; 0 \;\; \cdots \;\; 1].$


This gives a strong hint about the dimension of $\hat{V}$, which is the subject of
the next theorem.




Theorem 2 
If V is an n-dimensional vector space, then so is $\hat{V}$.


The following exercise is a preparation for the proof of Theorem 2. 


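Exercise

Let $\{\alpha_1, \alpha_2, \alpha_3\}$ be the standard basis of $R^3$ (i.e. in matrix form,

$\alpha_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \alpha_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \alpha_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$),

and let $\phi_1, \phi_2, \phi_3$ be the elements of $\widehat{R^3}$ with
matrices [1 0 0], [0 1 0], [0 0 1] respectively.
Find $\phi_i(\alpha_j)$ for each i, j = 1, 2, 3.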


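Solution

$\phi_i(\alpha_j)$ is the ith coordinate of $\alpha_j$. Thus:

$\phi_1(\alpha_1) = 1, \quad \phi_1(\alpha_2) = 0, \quad \phi_1(\alpha_3) = 0,$
$\phi_2(\alpha_1) = 0, \quad \phi_2(\alpha_2) = 1, \quad \phi_2(\alpha_3) = 0,$
$\phi_3(\alpha_1) = 0, \quad \phi_3(\alpha_2) = 0, \quad \phi_3(\alpha_3) = 1,$

or, more concisely,

$\phi_i(\alpha_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$


We can write this even more concisely using the Kronecker delta,
which we met in Unit 1, Vector Spaces (see page N15):


$\phi_i(\alpha_j) = \delta_{ij}.$


All that is necessary now is to generalize from $R^3$ to the general
case.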


Proof of Theorem 2 


Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis of V. From the result of the exercise above, it
would appear that our strategy should be to find a set $\{\phi_1, \ldots, \phi_n\}$ of
elements of $\hat{V}$, obeying

$\phi_i(\alpha_j) = \delta_{ij} \quad (i, j = 1, \ldots, n), \qquad (4)$


and prove that it is a basis. Such a set is to hand at once; for we saw in
Unit 2 (Theorem 1.17, page N34) that a linear transformation from V to U
is uniquely defined by its effect on a basis of V. For each fixed i, Equation
(4) defines the effect of $\phi_i$ on each of $\alpha_1, \alpha_2, \ldots, \alpha_n$:

$\phi_i(\alpha_1) = \delta_{i1}, \quad \phi_i(\alpha_2) = \delta_{i2}, \quad \ldots, \quad \phi_i(\alpha_n) = \delta_{in}.$

Thus $\{\phi_1, \ldots, \phi_n\}$ is completely and uniquely specified by Equation (4) and
all we have to do is to prove that:


(a) $\{\phi_1, \ldots, \phi_n\}$ is a linearly independent set;
(b) $\{\phi_1, \ldots, \phi_n\}$ spans $\hat{V}$.


We now set about proving each of these in turn. 


Proof of (a)


To say that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent is to say that no linear
combination of the $\phi_i$ (except the trivial combination $0\phi_1 + 0\phi_2 + \cdots + 0\phi_n$)
can be the zero functional. So we will take an arbitrary linear combination
$a_1\phi_1 + \cdots + a_n\phi_n$, and show that if this is equal to the zero functional,
then $a_1 = a_2 = \cdots = a_n = 0$.


Now if $a_1\phi_1 + \cdots + a_n\phi_n$ is the zero functional, then the image of every
vector in V is zero under this functional. Consider $\alpha_1$, for instance.


$(a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_1) = a_1\phi_1(\alpha_1) + a_2\phi_2(\alpha_1) + \cdots + a_n\phi_n(\alpha_1).$

But, by Equation (4),

$\phi_2(\alpha_1) = \phi_3(\alpha_1) = \cdots = \phi_n(\alpha_1) = 0,$

and

$\phi_1(\alpha_1) = 1.$

Thus

$(a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_1) = a_1.$

By an exactly similar argument,


$(a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_2) = a_2,$
$\vdots$
$(a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_n) = a_n.$


So, if we assume that the image of every vector in V is zero, then it must be
the case that $a_1 = a_2 = \cdots = a_n = 0$. And this is exactly what we require in
order to assert that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent.


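Proof of (b)


To say that $\{\phi_1, \ldots, \phi_n\}$ spans $\hat{V}$ is to say that, if $\psi$ is any element of $\hat{V}$, then
$\psi$ is a linear combination of $\phi_1, \ldots, \phi_n$.


Now $\psi$ is defined by its action on the basis $\{\alpha_1, \ldots, \alpha_n\}$ of V. If


$\psi(\alpha_1) = a_1,$
$\psi(\alpha_2) = a_2,$
$\vdots$
$\psi(\alpha_n) = a_n,$

then $[a_1 \; \cdots \; a_n]$ is the matrix representing $\psi$ with respect to the basis
$\{\alpha_1, \ldots, \alpha_n\}$ of V. We may take the matrix representatives of $\phi_1, \ldots, \phi_n$ as
$[1 \; 0 \; \cdots \; 0], \ldots, [0 \; \cdots \; 0 \; 1]$; then, since

$[a_1 \; \cdots \; a_n] = a_1[1 \; 0 \; \cdots \; 0] + \cdots + a_n[0 \; \cdots \; 0 \; 1],$

it is presumably the case that

$\psi = a_1\phi_1 + \cdots + a_n\phi_n. \qquad (5)$


Since a linear transformation is completely described by its effect on a
basis, all we have to do is to ensure that the two sides of Formula (5) have
the same effect on every element of a basis of V:


$\psi(\alpha_1) = (a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_1)$
$\psi(\alpha_2) = (a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_2)$
$\vdots$
$\psi(\alpha_n) = (a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_n)$

This is easy; for any j = 1, ..., n, we have

$(a_1\phi_1 + \cdots + a_n\phi_n)(\alpha_j) = a_1\phi_1(\alpha_j) + \cdots + a_j\phi_j(\alpha_j) + \cdots + a_n\phi_n(\alpha_j)$
$= a_j$ (by Equation (4))
$= \psi(\alpha_j)$ (by the definition of $\psi$).

So we have proved that $\{\phi_1, \ldots, \phi_n\}$ spans $\hat{V}$.


Thus $\{\phi_1, \ldots, \phi_n\}$ is a basis of $\hat{V}$, which is therefore n-dimensional.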




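Exercise


If V is 3-dimensional and $\{\alpha_1, \alpha_2, \alpha_3\}$ is any basis for it, calculate $\phi(\alpha)$,
where $\alpha = \alpha_1 + 2\alpha_2 - \sqrt{3}\alpha_3$, and $\phi = 4\phi_1 - 5\phi_2 + \sqrt{3}\phi_3$, with
$\phi_1, \phi_2, \phi_3$ defined as in Equation (4).


Solution

$\phi_1(\alpha) = \phi_1(\alpha_1 + 2\alpha_2 - \sqrt{3}\alpha_3) = 1$
(first coordinate of $\alpha$)

$\phi_2(\alpha) = \phi_2(\alpha_1 + 2\alpha_2 - \sqrt{3}\alpha_3) = 2$
(second coordinate of $\alpha$)

$\phi_3(\alpha) = \phi_3(\alpha_1 + 2\alpha_2 - \sqrt{3}\alpha_3) = -\sqrt{3}$
(third coordinate of $\alpha$)

Thus


$\phi(\alpha) = 4\phi_1(\alpha) - 5\phi_2(\alpha) + \sqrt{3}\phi_3(\alpha)$
$= (4 \times 1) + (-5 \times 2) + (\sqrt{3} \times -\sqrt{3})$
$= 4 - 10 - 3 = -9.$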
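
The calculation above can be repeated in Python as a rough check. This sketch assumes we work only with coordinates: `alpha_coords` holds the coordinates of $\alpha$ with respect to $\{\alpha_1, \alpha_2, \alpha_3\}$ and `phi_coords` those of $\phi$ with respect to the dual basis (both names are ours). Floating-point arithmetic is used, so the result is compared with a tolerance.

```python
import math

alpha_coords = [1, 2, -math.sqrt(3)]   # alpha = a1 + 2*a2 - sqrt(3)*a3
phi_coords = [4, -5, math.sqrt(3)]     # phi = 4*phi1 - 5*phi2 + sqrt(3)*phi3

# Because phi_i(alpha_j) = delta_ij, phi(alpha) is the sum of products
# of corresponding coordinates.
value = sum(b * x for b, x in zip(phi_coords, alpha_coords))

agrees_with_text = math.isclose(value, -9.0)
```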


12.1.3 The Dual Basis 


We now have two distinct ways of representing a linear functional on V by
numbers. On the one hand, we can think of it as a linear transformation
from V to F and represent it (with respect to some given basis $\{\alpha_1, \ldots, \alpha_n\}$
in V) by a one-row matrix of the form


$[b_1 \;\; b_2 \;\; \cdots \;\; b_n].$


On the other hand, we can regard it as an element of the vector space $\hat{V}$,
and represent it by the n-tuple of coordinates with respect to a basis
$\{\phi_1, \ldots, \phi_n\}$ in $\hat{V}$; if the linear functional is


$b_1\phi_1 + \cdots + b_n\phi_n,$

this n-tuple is

$(b_1, \ldots, b_n).$


Obviously it would be convenient if the numbers $b_1, \ldots, b_n$ in the two
representations were the same; and we can achieve precisely this by taking
$\{\phi_1, \ldots, \phi_n\}$ to be the basis defined in the preceding sub-section, by means
of the formula $\phi_i(\alpha_j) = \delta_{ij}$ (see Equation (4), page C11).


Because of this useful property we give the basis $\{\phi_1, \ldots, \phi_n\}$ in $\hat{V}$ a special
name.

Definition

If $\{\alpha_1, \ldots, \alpha_n\}$ is a basis of V, then the basis $\{\phi_1, \ldots, \phi_n\}$ of $\hat{V}$ defined by


$\phi_i(\alpha_j) = \delta_{ij} \quad (i, j = 1, \ldots, n)$

is called the dual basis of $\hat{V}$ (i.e. the basis of $\hat{V}$ dual to $\{\alpha_1, \ldots, \alpha_n\}$).


If $\{\alpha_1, \ldots, \alpha_n\}$ is denoted by the symbol A, then we denote the dual basis
by $\hat{A}$.


Although the basis A has a “dual”, it does not follow that the individual
vectors comprising it have individual duals. There is a natural one-one
correspondence between bases in V and bases in $\hat{V}$, but not between vectors
in V and vectors in $\hat{V}$. Perhaps the best way to illustrate this is to take a
basis $\{\alpha_1, \alpha_2\}$ of $R^2$, with the dual basis $\{\phi_1, \phi_2\}$ of $\widehat{R^2}$, then change just one
of the basis elements of V. We find that both elements change in the dual
basis (see the following example), which shows that this particular form
of “duality” is a property of the basis as a whole, rather than individual
vectors.




Example

Let A = {(1, 0), (0, 1)}, $\hat{A}$ = {[1 0], [0 1]} be the standard bases of $R^2$, $\widehat{R^2}$.
Consider a new basis

A' = {(-1, 2), (0, 1)}

for $R^2$, in which the first basis element is different from the first basis
element of A, but the second basis element is the same as the second basis
element of A. What is the dual basis $\hat{A}'$? We can determine this as follows.
Let

$\hat{A}' = \{\phi_1', \phi_2'\} = \{[a \;\; b], [c \;\; d]\}.$

Then

$\phi_1'(-1, 2) = 1, \quad \phi_1'(0, 1) = 0,$
$\phi_2'(-1, 2) = 0, \quad \phi_2'(0, 1) = 1,$

that is,

$-a + 2b = 1, \quad b = 0,$
$-c + 2d = 0, \quad d = 1,$

so that

$\phi_1' = [-1 \;\; 0], \quad \phi_2' = [2 \;\; 1],$

and both elements of the dual basis are different from the corresponding
elements of the original dual basis. (It may interest you to notice, however,
that $\phi_1$ and $\phi_1'$ span the same subspace of $\widehat{R^2}$. This is related to the fact that
the second element of A is the same as the second element of A'.)
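
The defining property $\phi_i'(\alpha_j') = \delta_{ij}$ of the new dual basis can be checked directly. In this short sketch the vectors and rows are copied from the example; `apply_row` is our own helper, not notation from the set books.

```python
new_basis = [(-1, 2), (0, 1)]   # A'
dual_rows = [(-1, 0), (2, 1)]   # one-row matrices of phi_1', phi_2'

def apply_row(row, vector):
    """Value of the functional with matrix `row` at `vector`."""
    return sum(r * v for r, v in zip(row, vector))

# Entry (i, j) of the table is phi_i'(alpha_j'); it should be the Kronecker delta.
table = [[apply_row(row, vec) for vec in new_basis] for row in dual_rows]
```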
Example


What happens if we simply take for A' the negatives of the elements of A,
i.e. if $A' = \{-\alpha_1, -\alpha_2, \ldots, -\alpha_n\}$? We would intuitively expect $\hat{A}'$ to be
just the negatives of the elements of $\hat{A}$ in this case, and this expectation is
justified. We begin with the definitions of $\hat{A}$ and $\hat{A}'$:

$\phi_i(\alpha_j) = \delta_{ij} \qquad (1)$
$\phi_i'(-\alpha_j) = \delta_{ij} \qquad (2)$

Equation (2) implies

$-\phi_i'(\alpha_j) = \delta_{ij}$

so that

$\phi_i'(\alpha_j) = -\delta_{ij}$
$= -\phi_i(\alpha_j)$, by Equation (1).

As this is true for all i, j, it follows that


$\phi_i' = -\phi_i \quad (i = 1, \ldots, n).$


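Exercises


1. Let $\hat{A} = \{\phi_1, \ldots, \phi_n\}$ be the basis of $\hat{V}$ dual to the basis $A = \{\alpha_1, \ldots, \alpha_n\}$
of V.

What are the bases dual to:

(i) $\{2\alpha_1, 2\alpha_2, \ldots, 2\alpha_n\}$,
(ii) $\{a_1\alpha_1, a_2\alpha_2, \ldots, a_n\alpha_n\}$

where $a_1, \ldots, a_n$ are scalars, none of them being equal to zero?


2. Let V be a 2-dimensional vector space, with basis $A = \{\alpha_1, \alpha_2\}$.
Let the dual basis $\hat{A}$ of $\hat{V}$ be $\{\phi_1, \phi_2\}$.

(i) Let $A' = \{\alpha_1 + 2\alpha_2, 3\alpha_1 + 4\alpha_2\}$. Find $\hat{A}'$.

(ii) Now let $A' = \{a_1\alpha_1 + a_2\alpha_2, a_3\alpha_1 + a_4\alpha_2\}$.
Write out equations to determine
$\hat{A}' = \{b_1\phi_1 + b_2\phi_2, b_3\phi_1 + b_4\phi_2\}$.

(iii) Write out the last equations as a matrix equation.

(iv) Take a guess at a matrix equation for finding a dual basis in an
n-dimensional dual space.


3. Exercise 5, page N132.


Solutions


1. (i) Writing $\alpha_j' = 2\alpha_j$ for the new basis vectors and $\phi_i'$ for the elements of
the new dual basis, we have

$\phi_i'(\alpha_j') = \delta_{ij},$

that is

$\phi_i'(2\alpha_j) = \delta_{ij}$
$2\phi_i'(\alpha_j) = \delta_{ij}$
$\phi_i'(\alpha_j) = \tfrac{1}{2}\delta_{ij}$
$= \tfrac{1}{2}(\phi_i(\alpha_j))$ for all relevant i, j.

Thus,

$\phi_i' = \tfrac{1}{2}\phi_i \quad (i = 1, \ldots, n)$

and

$\hat{A}' = \tfrac{1}{2}\hat{A}.$

(ii) $\phi_i'(\alpha_j') = \delta_{ij},$

that is

$\phi_i'(a_j\alpha_j) = \delta_{ij}$
$a_j\phi_i'(\alpha_j) = \delta_{ij}$
$\phi_i'(\alpha_j) = \dfrac{1}{a_j}\delta_{ij}$ (as $a_j \neq 0$).

The only time $\delta_{ij}$ is not zero is when i = j, and so

$\dfrac{1}{a_j}\delta_{ij} = \dfrac{1}{a_i}\delta_{ij}.$

Thus for each fixed i,

$\phi_i'(\alpha_j) = \dfrac{1}{a_i}(\phi_i(\alpha_j))$

as j varies from 1 to n.
Thus

$\phi_i' = \dfrac{1}{a_i}\phi_i \quad (i = 1, \ldots, n).$


2. (i) Let $\hat{A}' = \{\phi_1', \phi_2'\} = \{a\phi_1 + b\phi_2, c\phi_1 + d\phi_2\}$.
We have the equations

$\phi_1'(\alpha_1') = (a\phi_1 + b\phi_2)(\alpha_1 + 2\alpha_2) = 1,$
$\phi_1'(\alpha_2') = (a\phi_1 + b\phi_2)(3\alpha_1 + 4\alpha_2) = 0,$
$\phi_2'(\alpha_1') = (c\phi_1 + d\phi_2)(\alpha_1 + 2\alpha_2) = 0,$
$\phi_2'(\alpha_2') = (c\phi_1 + d\phi_2)(3\alpha_1 + 4\alpha_2) = 1. \qquad (3)$

That is

$a + 2b = 1, \quad 3a + 4b = 0, \qquad (4)$
$c + 2d = 0, \quad 3c + 4d = 1.$

Solving these equations gives

$a = -2, \quad b = \tfrac{3}{2}, \quad c = 1, \quad d = -\tfrac{1}{2}.$

Thus

$\phi_1' = -2\phi_1 + \tfrac{3}{2}\phi_2,$
$\phi_2' = \phi_1 - \tfrac{1}{2}\phi_2$

and so

$\hat{A}' = \{-2\phi_1 + \tfrac{3}{2}\phi_2,\ \phi_1 - \tfrac{1}{2}\phi_2\}.$

(ii) In this case

$\phi_1'(\alpha_1') = (b_1\phi_1 + b_2\phi_2)(a_1\alpha_1 + a_2\alpha_2) = 1$
$\phi_1'(\alpha_2') = (b_1\phi_1 + b_2\phi_2)(a_3\alpha_1 + a_4\alpha_2) = 0$
$\phi_2'(\alpha_1') = (b_3\phi_1 + b_4\phi_2)(a_1\alpha_1 + a_2\alpha_2) = 0$
$\phi_2'(\alpha_2') = (b_3\phi_1 + b_4\phi_2)(a_3\alpha_1 + a_4\alpha_2) = 1$

Thus

$b_1a_1 + b_2a_2 = 1, \quad b_1a_3 + b_2a_4 = 0,$
$b_3a_1 + b_4a_2 = 0, \quad b_3a_3 + b_4a_4 = 1.$

(iii)
$\begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} a_1 & a_3 \\ a_2 & a_4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

(If in (ii) you wrote the equations out as $a_1b_1 + a_2b_2 = 1$,
etc., then you would get the matrix equation

$\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} b_1 & b_3 \\ b_2 & b_4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

from which the equation

$\begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix} \begin{bmatrix} a_1 & a_3 \\ a_2 & a_4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

can be obtained simply by taking the transpose of each
side: remember that $(AB)^T = B^TA^T$.)

(iv) In (iii), we see that

$RP = I$

where the columns of P are the coordinates of the new
basis of V with respect to the original basis and the rows
of R are the coordinates of the new dual basis of $\hat{V}$ with
respect to the original dual basis. It is therefore a
reasonable guess that this equation holds in n dimensions
also. Since this means that $R = P^{-1}$ we have a rule:

To find the coordinates of the new dual basis of $\hat{V}$ with
respect to the old dual basis, write out the matrix
whose columns consist of the coordinates of the new
basis of V with respect to the old basis. Invert it, and
read off the rows of the inverse matrix.

This can be put in terms of matrices of transition. You saw
in Unit 3, Hermite Normal Form, (page N50) that the
matrix whose columns are coordinates of a new basis with
respect to an old basis, is called the matrix of transition.
Thus, in this case, P is the matrix of transition from the
old basis A of V to the new basis A'.

However, the matrix R is not the matrix of transition
from the old dual basis $\hat{A}$ to the new dual basis $\hat{A}'$, since
it is the rows of R, and not the columns, that express the
new dual basis in terms of the old. This is quite easy to
rectify, though; for if we let Q be the transpose of R, then
the columns of Q are the rows of R, and so Q is the matrix
of transition from $\hat{A}$ to $\hat{A}'$.

Thus the relation between the matrices of transition P, Q
in V and $\hat{V}$ respectively, is:

$Q = R^T$

$Q = (P^{-1})^T.$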
3. (See N’s answer on page N332.)


The problem asks us to show that when $\alpha$ is expressed as a
linear combination of the basis vectors $\{\alpha_1, \ldots, \alpha_n\}$, the
coefficient of $\alpha_i$ is $\phi_i(\alpha)$; in other words, that if


$\alpha = a_1\alpha_1 + \cdots + a_n\alpha_n \qquad (5)$


then


$a_i = \phi_i(\alpha) \quad (i = 1, \ldots, n).$


What do we know about $\phi_i$? It is defined here as an element
of the dual basis, i.e.


$\phi_i(\alpha_j) = \delta_{ij}. \qquad (6)$


The number we want is $\phi_i(\alpha)$. Using the linearity of $\phi_i$, we
can get this from Equations (5) and (6):


$\phi_i(\alpha) = a_1\phi_i(\alpha_1) + \cdots + a_i\phi_i(\alpha_i) + \cdots + a_n\phi_i(\alpha_n)$
$= a_1 \times 0 + \cdots + a_i \times 1 + \cdots + a_n \times 0$
$= a_i.$


Since this holds for i = 1, 2, ..., n, Equation (5) for $\alpha$ can now
be written


$\alpha = \phi_1(\alpha)\alpha_1 + \cdots + \phi_n(\alpha)\alpha_n,$

which is the result we want.


Notice how the solution uses the fact that $\{\alpha_1, \ldots, \alpha_n\}$ is a
basis: the fact that it spans V justifies writing $\alpha$ in the form of
Equation (5), and the fact that it is linearly independent makes
the numbers $a_i$ appearing in Equation (5) unique, and so
justifies taking Equation (5) as our definition of these numbers.
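
The result of Exercise 3 says that the dual basis reads off coordinates. As an illustrative sketch, we reuse the basis {(-1, 2), (0, 1)} of $R^2$ and its dual basis {[-1 0], [2 1]} from the earlier example; the vector (3, 5) and all the Python names are our own choices.

```python
basis = [(-1, 2), (0, 1)]   # alpha_1, alpha_2
dual = [(-1, 0), (2, 1)]    # one-row matrices of phi_1, phi_2
alpha = (3, 5)              # arbitrary vector, in standard coordinates

def apply_row(row, vector):
    """Value of the functional with matrix `row` at `vector`."""
    return sum(r * v for r, v in zip(row, vector))

# a_i = phi_i(alpha): the coordinates of alpha with respect to `basis`.
coords = [apply_row(phi, alpha) for phi in dual]

# Rebuild alpha as phi_1(alpha)*alpha_1 + phi_2(alpha)*alpha_2.
rebuilt = tuple(sum(c * vec[k] for c, vec in zip(coords, basis))
                for k in range(len(alpha)))
```

Here `coords` holds the two coefficients and `rebuilt` recovers the original vector, as the last displayed equation of the solution predicts.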


The rule given in the solution to Exercise (2) is confirmed in the following 
theorem. We have left blanks in the proof for you to fill in. You should not 
have too much difficulty; if you do, read the solutions to the exercises in 
this section again, carefully. Remember also that, if $p_{ij}$ is a typical matrix
entry of a matrix P, then the first suffix i refers to the row which $p_{ij}$ is in,
while the second suffix j refers to the column. 


Theorem 3


Let V be a vector space with basis $A = \{\alpha_1, \ldots, \alpha_n\}$, and let $\hat{A} = \{\phi_1, \ldots, \phi_n\}$
be the dual basis in $\hat{V}$. If A' is a new basis of V, whose coordinates with
respect to A are the columns of an n × n matrix P, then the coordinates of
$\hat{A}'$ with respect to $\hat{A}$ are the rows of an n × n matrix R, where


$RP = I.$


In other words, if P is the matrix of transition from A to A' and Q is the
matrix of transition from $\hat{A}$ to $\hat{A}'$, then since Q is thus equal to $R^T$, we
obtain the equation


$Q = (P^{-1})^T.$

Proof

If the components of $A' = \{\alpha_1', \ldots, \alpha_n'\}$ with respect to A are the columns
of $P = [p_{ij}]$, then since $p_{1j}, \ldots, p_{nj}$ are the elements of the jth column of
the matrix P,

$\alpha_j' = \sum_{k=1}^{n}$ ______ (i)

If the components of $\hat{A}' = \{\phi_1', \ldots, \phi_n'\}$ with respect to $\hat{A}$ are the rows of
$R = [r_{ij}]$, then, since $r_{i1}, \ldots, r_{in}$ are the elements of the ith row of the
matrix R,

$\phi_i' = \sum_{l=1}^{n}$ ______ (ii)

Thus

$\phi_i'(\alpha_j') = \sum_{l=1}^{n} \sum_{k=1}^{n}$ ______ $\phi_l(\alpha_k)$ (iii)


But since $\phi_l(\alpha_k) = \delta_{lk}$ we only need to sum over l, and we can set k = l.
Thus


$\phi_i'(\alpha_j') = \sum_{l=1}^{n}$ ______ (iv)


But the left-hand side of (iv) is the (ij)th entry of the matrix ______ (v)
and the right-hand side is the (ij)th entry of the matrix ______ (vi)


Thus RP = I, as required, and since the matrix of transition Q is $R^T$,
we get the equation


$Q = (P^{-1})^T.$


The blanks should be filled in as follows:

(i) $\alpha_j' = \sum_{k=1}^{n} p_{kj}\alpha_k$
(ii) $\phi_i' = \sum_{l=1}^{n} r_{il}\phi_l$
(iii) $\phi_i'(\alpha_j') = \sum_{l=1}^{n} \sum_{k=1}^{n} r_{il}p_{kj}\phi_l(\alpha_k)$
(iv) $\phi_i'(\alpha_j') = \sum_{l=1}^{n} r_{il}p_{lj}$
(v) I
(vi) RP
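
The rule confirmed by Theorem 3 (write the new basis as the columns of P, invert, read off the rows) can be sketched as follows. This is one possible implementation under our own assumptions: exact arithmetic with `fractions.Fraction`, a plain Gauss-Jordan reduction of [P | I], and hypothetical function names. If the given vectors do not form a basis, no pivot exists and the code raises `StopIteration`.

```python
from fractions import Fraction

def inverse(P):
    """Invert a square matrix by Gauss-Jordan reduction of [P | I]."""
    n = len(P)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def dual_basis_rows(new_basis):
    """Rows of R = P^(-1), where the columns of P are the new basis vectors."""
    n = len(new_basis)
    P = [[new_basis[j][i] for j in range(n)] for i in range(n)]
    return inverse(P)

# Exercise 2(i): A' = {alpha_1 + 2*alpha_2, 3*alpha_1 + 4*alpha_2}.
rows = dual_basis_rows([(1, 2), (3, 4)])
```

For this input the rows are (-2, 3/2) and (1, -1/2), in agreement with the solution to Exercise 2(i).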
Exercise 


Exercise 2, page N131. (If you have forgotten how to calculate the inverse 
of a matrix, see page N61.) 


Solution 


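(a) Writing the components of {(1, 0, 0), (0, 1, 0), (0, 0, 1)} as the
columns of a matrix P simply makes P into the unit matrix I,
which is its own inverse. Reading off the rows of I:


$\hat{A} = \{[1 \;\; 0 \;\; 0], [0 \;\; 1 \;\; 0], [0 \;\; 0 \;\; 1]\}.$


(b)
$P = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$


We can invert P by finding the Hermite normal form of


$\left[\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right],$


which is


$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right].$


Thus:


$\hat{A} = \{[1 \;\; {-1} \;\; 0], [0 \;\; 1 \;\; {-1}], [0 \;\; 0 \;\; 1]\}.$


(c)
$P = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 1 \\ -1 & 0 & 1 \end{bmatrix}$


The Hermite normal form of


$\left[\begin{array}{ccc|ccc} 1 & -1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ -1 & 0 & 1 & 0 & 0 & 1 \end{array}\right]$


is


$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac{1}{2} & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & 1 & 0 & -\tfrac{1}{2} & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & 0 & 1 & \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \end{array}\right].$


Thus

$\hat{A} = \{[\tfrac{1}{2} \;\; \tfrac{1}{2} \;\; {-\tfrac{1}{2}}], [-\tfrac{1}{2} \;\; \tfrac{1}{2} \;\; {-\tfrac{1}{2}}], [\tfrac{1}{2} \;\; \tfrac{1}{2} \;\; \tfrac{1}{2}]\}.$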
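
The answers to (b) and (c) can be checked without redoing the row reduction, by confirming that RP = I. This is a small sketch with the matrices typed in from the solution above; `matmul` is our own helper and exact fractions are used for the halves.

```python
from fractions import Fraction

def matmul(R, P):
    """Product of two matrices given as lists of rows."""
    return [[sum(R[i][k] * P[k][j] for k in range(len(P)))
             for j in range(len(P[0]))] for i in range(len(R))]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
half = Fraction(1, 2)

P_b = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
R_b = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]

P_c = [[1, -1, 0], [0, 1, 1], [-1, 0, 1]]
R_c = [[half, half, -half], [-half, half, -half], [half, half, half]]

check_b = matmul(R_b, P_b) == identity
check_c = matmul(R_c, P_c) == identity
```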


12.1.4 Using the Dual Basis: an Example (Optional)
The “dual basis” concept is important in the study of crystals, because of
the regularity of structure of a crystal.


The simplest sort of crystal is one whose basic structure is that of
parallelepipeds*, as in the following diagram.


To make the thing easier to visualize, we will consider a “2-dimensional
crystal”, i.e. a splitting up of the 2-dimensional x, y plane into
parallelogram-shaped “crystal cells”.


A crystallographer is interested in functions that are periodic in the crystal; 
that is, functions whose values depend only on the position within a given 
“cell” of the crystal, and not on which “cell” is being considered. It is 
clearly sensible to consider such functions, for owing to the regularity of 
structure of the crystals, one would expect such a quantity as electric 
potential, or mechanical stress, to vary within a cell but to be independent 
of which cell one was in. (Strictly speaking, this would not apply to cells 
which are near the surface of the crystal, where strange things might be 
happening, but for a large crystal such an assumption would certainly 
seem reasonable for cells deep inside the crystal.) 


* A parallelepiped is a six-faced polyhedron whose faces are all parallelograms.




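Mathematically, such a function is characterized as follows:

$$f(\alpha) = f(\alpha + n_1\alpha_1 + n_2\alpha_2) \qquad (n_1, n_2 \in \mathbb{Z})$$

in the 2-dimensional case, or

$$f(\alpha) = f(\alpha + n_1\alpha_1 + n_2\alpha_2 + n_3\alpha_3) \qquad (n_1, n_2, n_3 \in \mathbb{Z})$$

in the 3-dimensional case.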


That is, the image under the function is unchanged if the element in the
domain is translated by a whole number of “steps”, each of which translates
a point within one cell to a corresponding point in an adjacent cell.
The obvious functions to choose in order to try to obtain periodicity are
sine and cosine functions, as these are periodic:

$$\sin x = \sin(x + n \times 2\pi) \qquad (n \in \mathbb{Z})$$

(Imagine a “one-dimensional crystal” whose “cell” is of length $2\pi$.)


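In the 2- and 3-dimensional cases, sines and cosines of vector quantities are
not defined, so we must convert the vector quantities $\alpha_1, \alpha_2, \alpha_3$, etc., into
scalars. What could be more natural than to do this by applying linear
functionals to them?


So we want to find linear functionals $\phi_i$ such that the functions

$$S_i : \alpha \longmapsto \sin(\phi_i(\alpha))$$

are periodic with respect to the required crystal cells. It turns out that the
appropriate $\phi_i$ can be obtained from the dual basis to the basis $\{\alpha_i\}$.


The functions

$$\alpha \longmapsto \sin(2\pi k_i\,\phi_i(\alpha))$$
$$\alpha \longmapsto \cos(2\pi k_i\,\phi_i(\alpha)) \qquad (k_i \in \mathbb{Z})$$

are periodic with respect to the crystal cells, because, for example,

$$\begin{aligned}
\sin(2\pi k_i\,\phi_i(\alpha + n_j\alpha_j)) &= \sin(2\pi k_i[\phi_i(\alpha) + n_j\phi_i(\alpha_j)]) \\
&= \sin(2\pi k_i[\phi_i(\alpha) + n_j\delta_{ij}]) \\
&= \sin(2\pi k_i\,\phi_i(\alpha)),
\end{aligned}$$

the last step holding because $k_i n_j \delta_{ij}$ is an integer.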


Further, they can be used to form a basis for the set of functions in which 
the crystallographer is interested. 
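
The periodicity argument above can be checked numerically. The following is a minimal sketch under stated assumptions: the cell edges `a1` and `a2` are hypothetical values chosen only for illustration, and the helper names `evaluate`, `wave` and `translate` are our own. The dual basis is read off from the rows of the inverse of the matrix whose columns are `a1` and `a2`.

```python
import math

# Hypothetical edge vectors of a 2-dimensional crystal cell (illustrative values).
a1 = (2.0, 0.0)
a2 = (1.0, 3.0)

# Dual basis: the rows of the inverse of the matrix with columns a1, a2.
det = a1[0] * a2[1] - a2[0] * a1[1]
phi1 = (a2[1] / det, -a2[0] / det)
phi2 = (-a1[1] / det, a1[0] / det)


def evaluate(phi, v):
    """Apply the linear functional phi (a row) to the vector v (a column)."""
    return phi[0] * v[0] + phi[1] * v[1]


def wave(phi, k, v):
    """The function v -> sin(2*pi*k*phi(v)) for an integer k."""
    return math.sin(2 * math.pi * k * evaluate(phi, v))


def translate(v, n1, n2):
    """Move v by n1 steps along a1 and n2 steps along a2."""
    return (v[0] + n1 * a1[0] + n2 * a2[0],
            v[1] + n1 * a1[1] + n2 * a2[1])


point = (0.3, 1.1)
moved = translate(point, 2, -1)
# Up to rounding error, the two values printed on each line agree.
print(wave(phi1, 3, point), wave(phi1, 3, moved))
print(wave(phi2, 3, point), wave(phi2, 3, moved))
```

Replacing `phi1` by a functional that is not an integer combination of the dual basis elements would, in general, destroy the agreement, which is why the dual basis is the natural choice here.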




12.1.5 Summary of Section 12.1 


In this section we defined the terms 


linear functional (page C5) 
dual space (page C10) 
dual basis (page C13) 
Theorems 
1. (page C9) 


The set of all linear functionals on V forms a vector space. 


2. (page C11) 

If $V$ is $n$-dimensional, so is $\hat{V}$.

3. (page C18)

Let $V$ be a vector space with basis $A$ and let $\hat{A}$ be the dual basis in $\hat{V}$. If $A'$
is a new basis of $V$ and the matrix of transition from $A$ to $A'$ is $P$, then the
coordinates of $\hat{A}'$ with respect to $\hat{A}$ are the columns of $Q$, the matrix of
transition from $\hat{A}$ to $\hat{A}'$, and $Q = (P^{-1})^T$.


Technique 


Given a basis $A$ of a vector space $V$, determine the corresponding dual basis
$\hat{A}$ in the dual vector space $\hat{V}$.


Notation
$[a\ b\ c]$ (page C9)
$\hat{V}$ (page C9)
$\hat{A}$ (page C13)


Vectors in $V$ are represented by one-column matrices. Vectors in $\hat{V}$ are
represented by one-row matrices.


READ Section IV-1 of N, starting on page N129, and Section IV-3, starting on
page N134, as far as line 19, page N135, and then from line −10, page N137
to the end of the section.




12.2 DUALITY 
12.2.1 Exchanging Function and Domain 


In the example considered in the Introduction to this unit, there was a
symmetrical relationship between the “customer’s space” and the
“greengrocer’s space”. The customer’s linear functionals (price lists) correspond
to the greengrocer’s vectors, and the customer’s vectors correspond to the
greengrocer’s linear functionals.


When we generalized this idea to vector spaces, we did so in a rather
unsymmetrical way: we showed that the linear functionals on any space $V$
could be regarded as vectors in its dual space $\hat{V}$, but we did not show that
the vectors in $V$ could be regarded as linear functionals on $\hat{V}$.


[Diagram: vectors in $V$; linear functionals on $V$; linear functionals on $\hat{V}$.]


In the present sub-section, we study the linear functionals on $\hat{V}$ and see
whether they do correspond in a natural way to vectors in $V$.


In the case of finite-dimensional spaces, the matrix representation makes it
very plausible that such a correspondence does exist. We have seen that
every element of $\hat{V}$ (that is, every linear functional on $V$) can be represented
as a one-row matrix, say $[b_1 \ldots b_n]$, where $n$ is the dimension of $V$ and $\hat{V}$.
A linear functional on $\hat{V}$ is therefore equivalent to a function from the set of
all ordered $n$-tuples of the form $[b_1 \ldots b_n]$ to the field of scalars $F$. Such a
linear functional will be of the form

$$[b_1 \ldots b_n] \longmapsto a_1b_1 + \cdots + a_nb_n \qquad ([b_1 \ldots b_n] \in \hat{V})$$

where $a_1, \ldots, a_n$ are scalars, and it can therefore be written in matrix
notation as

$$[b_1 \ldots b_n] \longmapsto [b_1 \ldots b_n] \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}. \qquad (1)$$

The scalars $a_1, \ldots, a_n$ characterize the mapping, and in this way every
linear functional on $\hat{V}$ can be represented by a one-column matrix

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}.$$

We have seen already, however, that the vectors in $V$ are represented by
one-column matrices. Thus the correspondence we are looking for can be
set up via the one-column matrix representation.


[Diagram: linear functionals on $\hat{V}$ and column vectors.]




The above argument, however, is open to the objection that it makes use of a
particular matrix representation; since matrix representations depend on the
basis used, it is not clear whether the correspondence we have set up
between the vectors in $V$ and the linear functionals on $\hat{V}$ depends on the
basis or not. In fact, this correspondence does not depend on the basis,
and that is the reason for its importance. But to prove this independence
we must find a basis-independent way of describing the correspondence.
Just as we have done several times before in this unit, we can get a strong
clue to how this correspondence comes about by looking at the way the
matrix representation works out. What we have been saying essentially is
that in the expression

$$[b_1 \ldots b_n] \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$$

we can either regard the $a$'s as fixed and the $b$'s as varying, or the $b$'s as fixed
and the $a$'s as varying. It can be thought of either as an image under a function
specified by

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$$

with domain the set of all $[b_1 \ldots b_n]$, or as an image under a function
specified by $[b_1 \ldots b_n]$, with domain the set of all

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}.$$

It’s purely and simply a question of swapping the roles of function space
and domain. Whether we can carry out this swap in a mathematically
satisfactory way depends solely on our ability to juggle with the concepts
of “function” and “domain”, and not at all on the nature of the particular
function space and domain being considered.


Example 


John, Jack and Jill get the following percentages in their end-of-term school 
exams, for English, Maths and Science. 


            John   Jack   Jill

English      62     50     60
Maths        78     40     60
Science      61     49     60


This table can be read in two ways. The headmaster might be interested in
how John, Jack and Jill are doing in their various subjects. For example,
he would regard “John” as a function whose domain is the set of subjects:


John : English ⟼ 62, etc.


An educational psychologist, however, might look at the above table
rather differently. He might wish to compare the teaching success of the
English, Maths and Science teachers. To do this, he would perhaps regard
“Maths”, for example, as a function whose domain is the set of pupils:


Maths : John ⟼ 78, etc.


Our object is to find a concise mathematical description of this “swapping”
process. If we treat the two sets (English, Maths, Science) and (John, Jack,
Jill) on a par, then the above table is really describing a function on the
Cartesian product:

F : (English, Maths, Science) × (John, Jack, Jill) ⟶ R

Thus,


F((English, Jack)) = 50
F((Science, John)) = 61


etc.


Essentially, we have a similar situation to the one we had in Unit M100 19,
Relations, where we saw that a function $f : A \longrightarrow B$ had an alternative
mathematical description as a relation on the set $A \times B$. Here, too, we have
an alternative mathematical description of a situation: if $S$ is a set of
functions with domain $T$ and codomain $\mathbb{R}$, then we can alternatively
describe the situation as a function $F : S \times T \longrightarrow \mathbb{R}$, where

$$F(f, x) = f(x) \qquad (f \in S,\ x \in T)$$

or again as a set $\tilde{T}$ of functions with domain $S$ and codomain $\mathbb{R}$. (You can
pronounce $\tilde{T}$ “T-tilde” or “T-twiddle”.)


$S$ functions                                   $\tilde{T}$ functions
$T$ domain              $S \times T$              $S$ domain
$f \in S,\ x \in T$     $(f, x) \in S \times T$   $\tilde{x} \in \tilde{T},\ f \in S$
$f(x)$          $=$     $F(f, x)$         $=$     $\tilde{x}(f)$


We write $\tilde{T}$ instead of $T$ if we are thinking of $T$ as defining a set of functions,
and $\tilde{x}$ instead of $x$ ($x \in T$) because an element of $T$, the domain of $S$, is not
exactly the same thing as a function with domain $S$. There is, rather, a
natural correspondence between $x$ and $\tilde{x}$, given by

$$\tilde{x}(f) = f(x) \qquad (x \in T,\ f \in S) \qquad (2)$$


Equation (2) is a most important equation, as it expresses this particular
aspect of duality: “function space” and “domain” are dual concepts, and
can exchange roles by means of Equation (2). In terms of our example,
we have the following situation. If (John, Jack, Jill) is the function space,
and (English, Maths, Science) the domain, then one has functions:


John(English) = 62, John(Maths) = 78, etc.
Jack(English) = 50, etc.


We know that English, Maths and Science can be regarded as functions,
and we put ~ above them to distinguish between their role as functions and
their role as elements of a domain:


$\widetilde{\text{English}}(\text{John}) = 62$, $\widetilde{\text{English}}(\text{Jack}) = 50$, etc.
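
The exchange of roles expressed by Equation (2) can be sketched in Python using the marks table of the example. This is only an illustration: the names `marks`, `F`, `pupil_as_function` and `subject_tilde` are our own, and the argument order of `F` (subject first, pupil second) follows the Cartesian product written above.

```python
# The table of marks, stored as a function on the Cartesian product
# (subjects) x (pupils).
marks = {
    ("English", "John"): 62, ("English", "Jack"): 50, ("English", "Jill"): 60,
    ("Maths", "John"): 78,   ("Maths", "Jack"): 40,   ("Maths", "Jill"): 60,
    ("Science", "John"): 61, ("Science", "Jack"): 49, ("Science", "Jill"): 60,
}


def F(subject, pupil):
    """The function on the Cartesian product described by the table."""
    return marks[(subject, pupil)]


def pupil_as_function(pupil):
    """The headmaster's view: a pupil is a function on the set of subjects."""
    return lambda subject: F(subject, pupil)


def subject_tilde(subject):
    """The psychologist's view: a subject, with ~, is a function on the set of pupils."""
    return lambda pupil: F(subject, pupil)


john = pupil_as_function("John")
maths_tilde = subject_tilde("Maths")
print(john("Maths"), maths_tilde("John"))  # both are the same table entry
```

Both views are obtained from the single function `F` by fixing one argument, which is all that Equation (2) asserts.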


Exercise 


Let T be the set of all Open University students. Define a set S of functions
with domain T and codomain R, corresponding to the set of counties in
Britain:


S = (Yorkshire, Lancashire, Hertfordshire, ...)


where

Yorkshire(x) = 1 if x lives in Yorkshire

and

Yorkshire(x) = 0 if x does not live in Yorkshire, etc.


If Montague Z. Delacourt-Ponsonby lives in Lancashire, describe the
function


(Montague Z. Delacourt-Ponsonby)~.


(The ~ outside the parentheses means the same as one above the element.)


Solution


The domain of the function is the set of all counties, and the images
are given by

(Montague Z. Delacourt-Ponsonby)~(Lancashire) = 1,
(Montague Z. Delacourt-Ponsonby)~(t) = 0    (t ∈ S, t ≠ Lancashire).


12.2.2 Duality and Linear Functionals 


The above account of exchanging the roles of function and domain may
well generate the comment: “so what?” It does not, as it stands, appear to
lead very far. However, when we impose a definite structure on the sets S
and T, we get some interesting results. In particular, let us look at the case
where the domain, T, is the vector space $V$, and the function space, S, is the
dual space, $\hat{V}$.


$\hat{V}$ functions                                              $\tilde{V}$ functions
$V$ domain                     $\hat{V} \times V$                $\hat{V}$ domain
$\phi \in \hat{V},\ \alpha \in V$    $(\phi, \alpha) \in \hat{V} \times V$    $\tilde{\alpha} \in \tilde{V},\ \phi \in \hat{V}$
$\phi(\alpha)$           $=$   $F(\phi, \alpha)$          $=$    $\tilde{\alpha}(\phi)$


For any vector $\alpha \in V$, we get a function $\tilde{\alpha}$ with domain $\hat{V}$ and codomain $\mathbb{R}$.
What makes duality “tick” as far as vector spaces are concerned is this.
$\hat{V}$ is a vector space and so there is no reason why we should not form its
dual space, the space of all linear functionals on $\hat{V}$. We write $\hat{\hat{V}}$
(pronounced “V double hat” or “V double cap”) for this space. The remarkable
thing is that $\tilde{\alpha}$ turns out to be an element of $\hat{\hat{V}}$, i.e. a linear functional on
$\hat{V}$. We ask you to prove this fact in the next exercise.


Theorem 4

For any $\alpha \in V$, define the function

$$\tilde{\alpha} : \hat{V} \longrightarrow \mathbb{R}$$

by

$$\tilde{\alpha}(\phi) = \phi(\alpha) \qquad (\phi \in \hat{V}).$$

Then $\tilde{\alpha}$ is a linear functional on $\hat{V}$.




Exercise 


Prove Theorem 4. 


Solution 


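We know that $\phi(\alpha) \in \mathbb{R}$, and so $\tilde{\alpha}$ certainly maps $\hat{V}$ to $\mathbb{R}$. It remains
to show that $\tilde{\alpha}$ is linear, i.e. that $\tilde{\alpha}(a\phi + b\psi) = a\tilde{\alpha}(\phi) + b\tilde{\alpha}(\psi)$, for
$a, b \in \mathbb{R}$ and $\phi, \psi \in \hat{V}$.

$$\begin{aligned}
\tilde{\alpha}(a\phi + b\psi) &= (a\phi + b\psi)(\alpha) && \text{(by the definition of } \tilde{\alpha}\text{)} \\
&= (a\phi)(\alpha) + (b\psi)(\alpha) && \text{(by the definition of addition of functions)} \\
&= a\phi(\alpha) + b\psi(\alpha) \\
&= a\tilde{\alpha}(\phi) + b\tilde{\alpha}(\psi), && \text{as required.}
\end{aligned}$$


We have just seen that $\tilde{\alpha}$ is a linear functional on $\hat{V}$. That is to say, it is an
element of $\hat{\hat{V}}$. So the $\tilde{V}$ in the figure above is really a subset of $\hat{\hat{V}}$.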


Exercise 
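(i) Let $\alpha = (1, 2)$, $\phi = [3\ \ 4]$. Calculate $\tilde{\alpha}(\phi)$.
(ii) Calculate $\tilde{\alpha}([a\ \ b])$ for any $a, b \in \mathbb{R}$.
(iii) Let $\beta = (5, 6)$. Calculate $\tilde{\beta}([3\ \ 4])$.
(iv) Calculate $(2\alpha + 3\beta)^{\sim}([3\ \ 4])$.
(v) Calculate $(2\alpha + 3\beta)^{\sim}([a\ \ b])$ for any $a, b \in \mathbb{R}$.


Solution

(i) $\tilde{\alpha}(\phi) = \phi(\alpha) = [3\ \ 4]\begin{pmatrix} 1 \\ 2 \end{pmatrix} = 11.$

(ii) $\tilde{\alpha}([a\ \ b]) = [a\ \ b]\begin{pmatrix} 1 \\ 2 \end{pmatrix} = a + 2b.$

(iii) $\tilde{\beta}([3\ \ 4]) = [3\ \ 4]\begin{pmatrix} 5 \\ 6 \end{pmatrix} = 39.$

(iv) $(2\alpha + 3\beta)^{\sim}([3\ \ 4]) = [3\ \ 4]\left(2\begin{pmatrix} 1 \\ 2 \end{pmatrix} + 3\begin{pmatrix} 5 \\ 6 \end{pmatrix}\right) = [3\ \ 4]\begin{pmatrix} 17 \\ 22 \end{pmatrix} = 139.$

(v) $(2\alpha + 3\beta)^{\sim}([a\ \ b]) = [a\ \ b]\left(2\begin{pmatrix} 1 \\ 2 \end{pmatrix} + 3\begin{pmatrix} 5 \\ 6 \end{pmatrix}\right) = [a\ \ b]\begin{pmatrix} 17 \\ 22 \end{pmatrix} = 17a + 22b.$


Although in the above exercise we did not show that $(2\alpha + 3\beta)^{\sim} = 2\tilde{\alpha} + 3\tilde{\beta}$,
it is in fact the case, as we shall see in the next sub-section.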
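
The calculations in this exercise can be reproduced with a short Python sketch. The helper name `tilde` is our own; vectors of V are represented by tuples standing for one-column matrices, and elements of the dual space by tuples standing for one-row matrices.

```python
def tilde(alpha):
    """Return alpha-tilde: the map sending a one-row matrix phi to phi(alpha)."""
    def alpha_tilde(phi):
        # Row matrix times column matrix.
        return sum(p * a for p, a in zip(phi, alpha))
    return alpha_tilde


alpha = (1, 2)
beta = (5, 6)
# The vector 2*alpha + 3*beta, computed in V.
combo = tuple(2 * a + 3 * b for a, b in zip(alpha, beta))

phi = (3, 4)
print(tilde(alpha)(phi), tilde(beta)(phi), tilde(combo)(phi))
# Evaluating the combination 2*alpha-tilde + 3*beta-tilde at phi gives the same
# value as the tilde of the combined vector.
print(2 * tilde(alpha)(phi) + 3 * tilde(beta)(phi))
```

Trying other rows in place of `phi` gives agreement each time, which is the linearity of the correspondence established in the next sub-section.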




12.2.3 Is $\tilde{V}$ the Whole of $\hat{\hat{V}}$?


We have seen that $\tilde{V}$ is a subset of $\hat{\hat{V}}$. When $V$ is finite-dimensional, we can
establish the fact that $\tilde{V} = \hat{\hat{V}}$ by showing that they have the same
dimension, and invoking Theorem 4.7 on page N22.*


The easiest way to show that $\dim \tilde{V} = \dim \hat{\hat{V}}$ is to show that $\dim \tilde{V} = \dim V$;
for we can apply Theorem 2 twice to get the result

$$\dim V = \dim \hat{V} = \dim \hat{\hat{V}}.$$


We have a mapping from $V$ to $\hat{\hat{V}}$ that takes $\alpha$ to $\tilde{\alpha}$. If we label this mapping
$J$, we have

$$J : \alpha \longmapsto \tilde{\alpha} \qquad (\alpha \in V)$$

where

$$\tilde{\alpha}(\phi) = \phi(\alpha) \qquad (\phi \in \hat{V})$$

and $\tilde{V}$ is the image set $J(V)$. If we can prove that $J$ is a linear transformation,
then we can use the Dimension Theorem (Theorem 1.6, page N31):

$$\dim J(V) + \dim K(J) = \dim V$$

where $K(J)$ is the kernel of $J$. If $\dim K(J)$ turns out to be zero, then we have
the result we are looking for, and we can assert that $\dim \tilde{V} = \dim V$, and
that $J$ is an isomorphism of $V$ onto $\tilde{V}$.
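Theorem 5

If $\dim V$ is finite, then (i) $\dim \tilde{V} = \dim \hat{\hat{V}}$, and (ii) $J$ is an isomorphism of
$V$ onto $\hat{\hat{V}}$.

Proof

We first show that $J$ is a linear transformation, i.e. that

$$J(a\alpha + b\beta) = aJ(\alpha) + bJ(\beta)$$

for any vectors $\alpha, \beta \in V$ and any scalars $a, b$.


To calculate $J(a\alpha + b\beta)$, we calculate its effect on an arbitrary element $\phi$ of
$\hat{V}$:

$$\begin{aligned}
(J(a\alpha + b\beta))(\phi) &= (a\alpha + b\beta)^{\sim}(\phi) \\
&= \phi(a\alpha + b\beta) \\
&= a\phi(\alpha) + b\phi(\beta) \\
&= a\tilde{\alpha}(\phi) + b\tilde{\beta}(\phi) \\
&= a(J(\alpha))(\phi) + b(J(\beta))(\phi).
\end{aligned}$$

This is true for all $\phi \in \hat{V}$, so

$$J(a\alpha + b\beta) = aJ(\alpha) + bJ(\beta) \qquad (a, b \in \mathbb{R};\ \alpha, \beta \in V).$$

Next we show that $K(J) = 0$.
Suppose $J(\alpha) = 0$. Then

$$\begin{aligned}
(J(\alpha))(\phi) &= 0 && \text{for all } \phi \in \hat{V} \\
\tilde{\alpha}(\phi) &= 0 && \text{for all } \phi \in \hat{V} \\
\phi(\alpha) &= 0 && \text{for all } \phi \in \hat{V}
\end{aligned}$$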


* This Theorem has not been covered in a reading passage, but you should be able to follow N's
proof.




But if $\alpha \neq 0$, it can form the first element, $\alpha_1$, of a basis of $V$, in which case
if $\phi_1$ is the first element of the dual basis, then $\phi_1(\alpha_1) = 1$. Thus, since
$\phi(\alpha) = 0$ for all $\phi \in \hat{V}$, we cannot have $\alpha \neq 0$. That is,

$$\alpha = 0.$$

Thus $K(J) = 0$.


We can now establish part (i) of the theorem. Since $K(J) = 0$, $\dim K(J) = 0$
and the Dimension Theorem yields, since $\tilde{V} = J(V)$,

$$\dim \tilde{V} = \dim V$$

and hence (applying Theorem 2 twice)

$$\dim \tilde{V} = \dim \hat{\hat{V}}.$$


To establish part (ii) we use Theorem 4.7 on page N22, which implies:

Theorem 6


$\tilde{V}$ is the whole of $\hat{\hat{V}}$ (i.e. $J$ is onto $\hat{\hat{V}}$) if $V$ is finite-dimensional.


Hence, since $K(J) = 0$, $J$ is one-to-one, and so $J$ is an isomorphism of $V$
onto $\hat{\hat{V}}$. The mapping $J$ is just the correspondence between linear functionals
on $\hat{V}$ and vectors in $V$ which we set up using matrices in sub-section 12.2.1,
but now it has been defined in a basis-independent fashion. $J$ is, in fact,
completely basis-independent, i.e. it is a natural isomorphism between $V$
and $\hat{\hat{V}}$. It resembles other basis-independent concepts, such as dimension,
in this respect. There are, of course, any number of isomorphisms between
$V$ and $\hat{\hat{V}}$, and for that matter between $V$ and $\hat{V}$, as they all have the same
number of dimensions. But $J$ stands out from these, in a way in which no
other isomorphism from $V$ to $\hat{\hat{V}}$, and no isomorphism at all from $V$ to $\hat{V}$,
does, in being defined in basis-independent terms.


Exercises 


1. Exercise 1, page N134. (Hint: have you solved a problem like this
before? Look at Exercise 5, page N132, which you did as Exercise 3 in
sub-section 12.1.3.)


2. List some intrinsic, i.e. basis-independent, properties of linear
transformations that we have met in this course.
Solutions 

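1. This is the dual of Exercise 3 of sub-section 12.1.3. Thus:


Solution of the present problem

Let $\alpha = \sum_{i=1}^{n} a_i\alpha_i$. Then

$$\begin{aligned}
\phi_j(\alpha) &= \phi_j\Big(\sum_{i=1}^{n} a_i\alpha_i\Big) \\
&= \sum_{i=1}^{n} a_i\,\phi_j(\alpha_i) \\
&= \sum_{i=1}^{n} a_i\delta_{ji} \\
&= a_j.
\end{aligned}$$

That is, $\phi_j(\alpha) = a_j$.

Thus, $\alpha = \sum_{i=1}^{n} a_i\alpha_i = \sum_{i=1}^{n} \phi_i(\alpha)\,\alpha_i$.


Solution 3, sub-section 12.1.3

Let $\phi = \sum_{i=1}^{n} a_i\phi_i$. Then

$$\begin{aligned}
\phi(\alpha_j) &= \Big(\sum_{i=1}^{n} a_i\phi_i\Big)(\alpha_j) \\
&= \sum_{i=1}^{n} a_i\,\phi_i(\alpha_j) \\
&= \sum_{i=1}^{n} a_i\delta_{ij} \\
&= a_j.
\end{aligned}$$

That is, $\phi(\alpha_j) = a_j$.

Thus, $\phi = \sum_{i=1}^{n} a_i\phi_i = \sum_{i=1}^{n} \phi(\alpha_i)\,\phi_i$.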




2. The major intrinsic properties we have met are as follows.
In Unit 2, Linear Transformations we met the three “vital statistics”
of a linear transformation: dimension of domain, rank (dimension
of image space), dimension of codomain, which are intrinsic; so is
the nullity (dimension of the kernel), which is dependent on the
first two above, the dependence being expressed in the Dimension
Theorem. If the transformation is an endomorphism, it has further
intrinsic properties: its eigenvalues, invariant subspaces, and the
related properties introduced in Unit 5, Determinants and
Eigenvalues, such as eigenvectors, eigenspaces, characteristic polynomial.
Other intrinsic properties are those discussed in Unit 10, Jordan
Normal Form; for example, Jordan normal form itself. There are
many other properties with which by now we are very familiar;
e.g., the image of a vector space is a vector space, the image of the
zero vector is the zero vector, the kernel is a subspace of the
domain, etc.


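The intimacy of the relationship between $V$ and $\hat{\hat{V}}$ is further revealed when
we look for a basis in $\hat{\hat{V}}$. For example, if $V$ is two-dimensional (and
therefore so are $\hat{V}$ and $\hat{\hat{V}}$), suppose $V$, $\hat{V}$ and $\hat{\hat{V}}$ have bases:

$$A = \{\alpha_1, \alpha_2\}; \qquad \hat{A} = \{\phi_1, \phi_2\}; \qquad \hat{\hat{A}} = \{\lambda_1, \lambda_2\}$$

where $\hat{A}$ is the dual basis to $A$ and $\hat{\hat{A}}$ is the dual basis to $\hat{A}$. Then we can
find $\hat{\hat{A}}$ in terms of $A$ as follows.


The definition of dual basis (sub-section 12.1.3) gives

$$\phi_1(\alpha_1) = 1, \quad \phi_1(\alpha_2) = 0, \quad \phi_2(\alpha_1) = 0, \quad \phi_2(\alpha_2) = 1 \qquad (1)$$

and

$$\lambda_1(\phi_1) = 1, \quad \lambda_1(\phi_2) = 0, \quad \lambda_2(\phi_1) = 0, \quad \lambda_2(\phi_2) = 1. \qquad (2)$$

We have

$$\tilde{\alpha}_i(\phi) = \phi(\alpha_i) \qquad (\phi \in \hat{V}) \qquad (3)$$

Putting $\phi = \phi_1$, then $\phi = \phi_2$, in Equation (3) gives

$$\tilde{\alpha}_1(\phi_1) = \phi_1(\alpha_1) = 1, \qquad \tilde{\alpha}_1(\phi_2) = \phi_2(\alpha_1) = 0.$$

Similarly,

$$\tilde{\alpha}_2(\phi_1) = \phi_1(\alpha_2) = 0, \qquad \tilde{\alpha}_2(\phi_2) = \phi_2(\alpha_2) = 1. \qquad (4)$$


Comparing Equations (2) with Equations (4), we see that $\tilde{\alpha}_1 = \lambda_1$, as they are
equal on the elements $\{\phi_1, \phi_2\}$ of a basis of $\hat{V}$, and similarly $\tilde{\alpha}_2 = \lambda_2$. That
is to say,

$$\{\lambda_1, \lambda_2\} = \{\tilde{\alpha}_1, \tilde{\alpha}_2\}$$
$$\hat{\hat{A}} = \{\tilde{\alpha}_1, \tilde{\alpha}_2\} = \{J(\alpha_1), J(\alpha_2)\}.$$

To put it another way, with respect to bases $A$ of $V$ and $\hat{\hat{A}}$ of $\hat{\hat{V}}$ the matrix
representing $J$ is the identity matrix. This is true whatever basis $A$ we start
off with, and is another way of looking at the “naturalness” of the
correspondence $J$ between $V$ and $\hat{\hat{V}}$.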




12.2.4 Summary of Section 12.2 


In this section we defined the term 


natural isomorphism (page C29) 


Theorems 


4. (page C26) 

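For any $\alpha \in V$, let $\tilde{\alpha}(\phi) = \phi(\alpha)$ for all $\phi \in \hat{V}$; then $\tilde{\alpha}$ is a linear functional on $\hat{V}$.

5. (page C28)

If $\dim V$ is finite, then (i) $\dim \tilde{V} = \dim \hat{\hat{V}}$ and (ii) $J$ is an isomorphism of $V$
onto $\hat{\hat{V}}$.

6. (page C29)

$\tilde{V}$ is the whole of $\hat{\hat{V}}$ if $V$ is finite-dimensional.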


Notation 


$\tilde{\alpha}$ (page C26)
$\tilde{V}$ (page C26)
$\hat{\hat{V}}$ (page C26)


READ Section IV-2, starting on page N133.
Note Paragraph following the proof of Theorem 2.1 on pages N133-4.


What Nering means here is that in the infinite-dimensional case the concept of
continuity of a function over V is very much more subtle than in the finite-
dimensional case. Several different definitions can be given of what it could mean
for such a function to be “continuous”, and the particular definition chosen
depends on the exact use to which the infinite-dimensional space is being put.
If one restricts the definition of the “dual” of such a space by saying that it
consists of all continuous linear functionals, then the definition of continuity will
affect the size of the dual (and hence of $\hat{\hat{V}}$, the double dual); some definitions will
make the double dual equal to the original space, and others will not.




12.3 DUALITY IN ACTION 
12.3.0 Introduction 


Our discussion in the Introduction to the unit of greengrocers and their
customers was designed to help you to understand what a dual space is,
rather than to show you the idea of duality being used. This unit will not
help you next time you go shopping, and you may well be asking “what
use is the concept of duality to anybody who actually uses mathematics?”
In the first two sub-sections of this section we will try to give you the
beginnings of an answer to this question, and to show you how mathematical
ideas can be just as useful to the development of a science as mathematical
calculations. Both these sub-sections are optional; the second, on the
“delta function”, is one that you may find rather difficult, and you may
wish to come back to it for interest when you have some time to spare later
in the course.


Sub-section 12.3.3 returns to the pursuit of mathematics central to the 
development of the course, and discusses the concept of an annihilator. An 
annihilator is really a solution set of a system of homogeneous linear 
equations, but new insight is gained by looking at it from the standpoint of 
duality. This is an important section, as the material in it is used later in the 
course, in Unit 15, Affine Geometry and Convex Cones. 


12.3.1 Factor Analysis (Optional) 


We take our first example from psychology. Here even more than in the 
physical sciences, linear models are no more than approximations, often 
very rough approximations. 


The pioneers of the kind of psychology that attempts measurements on the 
natures of living organisms rather than speculations on the Nature of Life, 
were probably Pavlov and Watson, who used the “stimulus-response” 
model. That is to say, they considered animals (and people) as things that 
respond to external stimuli, in a predictable way. 


stimulus response 


In mathematical language, the animal may be considered as a function
whose domain is the set of all stimuli, and whose codomain is the set of all
responses. However, it is rather difficult to see how to give any
mathematical structure, in this model, to “the set of all stimuli”; and no
consistent account was taken of the differences in response between different
animals (or people). In fact, Watson assumed that there were no inherent
differences between animals of a given species (e.g. people), and that all
observed differences consisted merely of different ways in which the
environment had forced the various possible stimuli and responses together,
to form “conditioned responses”.


A major advance was made during the first half of this century, with the
recognition that (especially where people are concerned) differences
between one person and another are of great importance, and could be
quantified. The statistical techniques of factor analysis* allow these
differences to be expressed within a vector space model. The idea is that various
factors (intelligence, extraversion, tenacity, etc.) vary independently from
person to person, and by a series of tests it is possible (within a certain
degree of accuracy) to measure people on the various scales. But how are
we to decide exactly what to measure? The answer is that we construct a
wide variety of tests, record a number of people’s scores on these, and analyse
the results to see just how many dimensions the vector space needs to have,
in order to account (again, to within a specified degree of accuracy) for the
variation observed in the scores people obtain.


* The following book contains a good discussion of factor analysis: D. N. Lawley
and A. E. Maxwell, Factor Analysis as a Statistical Method (Butterworth, 1963).


The model that is used assumes that a person’s performance on a given
test will depend linearly on the degree to which he possesses various factors.
For instance, if the test is his ability to play tennis, it might be found that
this ability (A) depends to a great extent on the amount of practice he has
had (P), to a lesser extent on his intelligence (I), and to a lesser extent still
on his “tenacity” (T). We might have


A = 0.5P + 0.3I + 0.2T


so that ability to play tennis is a linear functional, with matrix
representation


[0.5 0.3 0.2],

on the vector space describing people.


Having worked out a consistent way of measuring people, we can “measure”
new “tests” against the vector space which measures the people. Ability
to fly a lunar module, for instance, might require a great deal of tenacity,
slightly less intelligence, and make no demands at all on how much practice
a person has had at tennis. Perhaps it would correspond to the matrix
[0 0.7 0.8]. In order to discover the various factors required, one would
simulate a lunar module and test various people whose characteristics
(intelligence, etc.) were already known. In effect, one would be working
out how to place the test of “flying a lunar module” in a vector space of
possible tests, and one would be regarding people with various levels of
measured intelligence, tenacity, etc., as linear functionals on this vector
space.


We thus have a complete duality between the characteristics of people 
and the situations in which they are placed. If we postulate a certain 
mathematical structure for one, we must postulate a dual structure for the 
other, and each is equally important as an object of study for the psycholo- 
gist. 




12.3.2 The Delta Function (Optional) 


This will probably be quite a difficult example to follow, but well worth the 
effort. It shows that the idea of a function can be generalized to include a 
class of objects that are not themselves functions, but may in a sense be 
“limits” of infinite sequences of functions. These generalized functions 
first arose in the work of the theoretical physicist, P. A. M. Dirac, in 1929, 
but it was not until 1945 that the mathematician L. Schwartz showed that 
they could be put on a rigorous footing using the concept of a linear 
functional. (There is a brief discussion of the same idea at the top of page 
K222.) 


This generalization comes about as a result of the fact that we can map a 
space of functions, such as C[0, 1] (the vector space of real-valued con- 
tinuous functions on [0, 1]), into its dual space, by means of integration. 
That is to say, corresponding to every function $g$ in $C[0, 1]$, we can define
a linear functional $L_g$ on $C[0, 1]$, by

$$L_g : f \longmapsto \int_0^1 f(t)g(t)\,dt \qquad (f \in C[0, 1]).$$


We should check that $L_g$ is really a linear functional. The domain and
codomain are certainly right; so we have to make sure that $L_g$ maps the
domain ($C[0, 1]$) linearly to the codomain $\mathbb{R}$. This follows directly from the
rules of integration (see Unit M100 13, Integration II):

$$\begin{aligned}
L_g(af_1 + bf_2) &= \int_0^1 (af_1 + bf_2)(t)\,g(t)\,dt \\
&= \int_0^1 \big(af_1(t)g(t) + bf_2(t)g(t)\big)\,dt \\
&= a\int_0^1 f_1(t)g(t)\,dt + b\int_0^1 f_2(t)g(t)\,dt \\
&= aL_g(f_1) + bL_g(f_2).
\end{aligned}$$


Now the mapping $L$ which takes $g$ to $L_g$ for every $g$ in $C[0, 1]$ is itself a
linear transformation, from $C[0, 1]$ to its dual. We can show this as follows.
For every $g$ in $C[0, 1]$, $L_g$ is a linear functional on $C[0, 1]$. Thus the domain
and codomain are right, and we have to make sure that $L$ is linear. That is
to say, let $g_1, g_2$ be arbitrary elements of $C[0, 1]$ and $a, b$ arbitrary real
numbers; then we have to show that

$$L_{ag_1 + bg_2} = aL_{g_1} + bL_{g_2}.$$


To show this, we compute the effect of $L_{ag_1 + bg_2}$ on an arbitrary element
$f$ of $C[0, 1]$.

$$\begin{aligned}
L_{ag_1 + bg_2}(f) &= \int_0^1 f(t)\,(ag_1 + bg_2)(t)\,dt \\
&= \int_0^1 f(t)\big(ag_1(t) + bg_2(t)\big)\,dt \\
&= a\int_0^1 f(t)g_1(t)\,dt + b\int_0^1 f(t)g_2(t)\,dt \\
&= aL_{g_1}(f) + bL_{g_2}(f).
\end{aligned}$$

Since this equality holds for all $f \in C[0, 1]$, we conclude that

$$L_{ag_1 + bg_2} = aL_{g_1} + bL_{g_2} \qquad (g_1, g_2 \in C[0, 1];\ a, b \in \mathbb{R}).$$


Thus $L$ is linear.




Now things do not go as smoothly in infinite-dimensional spaces as they do
in finite-dimensional spaces.* We have just proved that $L$ is a linear
transformation. However, $L$ is not an isomorphism of $C[0, 1]$ with its own dual,
and in fact there is no isomorphism of $C[0, 1]$ with its own dual. So what goes wrong?


The answer is that there are a vast number of elements of the dual of
$C[0, 1]$ that are not images of elements of $C[0, 1]$ under $L$. Consider, for
instance, the mapping

$$P_{1/2} : f \longmapsto f(\tfrac12) \qquad (f \in C[0, 1]).$$


This is the same sort of mapping as in Exercise 1(d) of sub-section 12.1.1,
and is easily seen to be a linear functional. But there is no function $g$ such
that

$$L_g = P_{1/2}.$$


To see this, try to imagine how the integral

$$\int_0^1 f(t)g(t)\,dt$$

varies as $f$ varies and $g$ remains fixed. It is intuitively clear that, in any
region where $g$ takes non-zero values, the value of the integral will be
affected by altering the values which $f$ takes in that region. On the other
hand, the value of $P_{1/2}(f)$ is affected only by the value of $f$ at $\tfrac12$; $f$ can take
on any values it likes at other points without affecting the value of $P_{1/2}(f)$.
Nevertheless, $P_{1/2}$ is not entirely unconnected with functions in
$C[0, 1]$; it is in some sense the limit of a sequence of functionals


$$L_{g_1},\ L_{g_2},\ \ldots$$


where $g_1, g_2, \ldots \in C[0, 1]$. We can construct a suitable sequence $g_1, g_2, \ldots$
by using the following idea. The value of $P_{1/2}(f)$ is affected only by the
function value at $\tfrac12$, so can we select functions $g_1, g_2, \ldots$ which narrow down
the range of points in $[0, 1]$ within which the value of $f$ affects the value of
$L_{g_n}(f)$? The answer is that we can do this, by requiring that the functions
$g_1, g_2, \ldots$ differ from zero only on certain sub-intervals of $[0, 1]$. We can,
for instance, demand that

$$g_n(x) = 0 \qquad (x \notin [\tfrac12 - 2^{-n},\ \tfrac12 + 2^{-n}])$$


and that $g_n(x)$ be constructed within the interval $[\tfrac12 - 2^{-n}, \tfrac12 + 2^{-n}]$ in such
a way that

$$\int_0^1 g_n(x)\,dx = \int_{(1/2) - 2^{-n}}^{(1/2) + 2^{-n}} g_n(x)\,dx = 1 \qquad (1)$$

and

$$g_n(x) \geq 0 \qquad (x \in [0, 1]).$$


Some $g_n$ are illustrated in the following figure.


* See the last paragraph on page N133.


[Figure: graphs of some of the functions $g_n(x)$ plotted against $x$.]


Now what about $L_{g_n}(f)$? We have specifically imposed condition (1) on
$g_n(x)$ so that

$$\lim_{n\ \text{large}} L_{g_n}(f) = f(\tfrac12).$$

Can you see why this should be so?


The reason is as follows. If we let $c_n$ be the minimum value of $f(x)$ in the
interval $[\tfrac12 - 2^{-n}, \tfrac12 + 2^{-n}]$, and $d_n$ the maximum value in this interval, then

$$\int_0^1 c_n g_n(x)\,dx \leq \int_0^1 f(x)g_n(x)\,dx \leq \int_0^1 d_n g_n(x)\,dx$$

i.e.

$$c_n \leq L_{g_n}(f) \leq d_n.$$


Now, because $f$ is continuous, and because the width of the interval
$[\tfrac12 - 2^{-n}, \tfrac12 + 2^{-n}]$ approaches zero, we must have

$$\lim_{n\ \text{large}} c_n = f(\tfrac12) = \lim_{n\ \text{large}} d_n.$$


Thus,

$$\lim_{n\ \text{large}} L_{g_n}(f) = f(\tfrac12) = P_{1/2}(f).$$

(Of course, we have omitted a number of points of mathematical detail
here, so as to give you a general idea of what is happening. To do it
properly would require much more time and space.)
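
The convergence described above can be watched numerically. The sketch below is one possible choice only: the text does not prescribe the shape of the functions g_n, and here we assume triangular “bumps” centred at 1/2, of half-width 2^(-n) and height 2^n, which are non-negative, vanish outside the required interval and satisfy condition (1). The integral is approximated by the midpoint rule; the names `g`, `L_g` and `square` are our own.

```python
def g(n, x):
    """An assumed triangular bump: zero outside [1/2 - 2**-n, 1/2 + 2**-n], area 1."""
    w = 2.0 ** -n
    return max(0.0, (w - abs(x - 0.5)) / (w * w))


def L_g(n, f, steps=2000):
    """Midpoint-rule approximation to the integral of f(x) * g_n(x) over [0, 1].

    Only the interval where g_n is non-zero needs to be covered.
    """
    w = 2.0 ** -n
    lo = 0.5 - w
    h = 2 * w / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * g(n, x)
    return total * h


def square(x):
    return x * x


# The values approach square(1/2) = 0.25 as n grows.
for n in (1, 2, 4, 8):
    print(n, L_g(n, square))
```

Any other continuous function on [0, 1] may be substituted for `square`; the printed values should then approach the value of that function at 1/2.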


Thus $\lim_{n\ \text{large}} L_{g_n}$ is a perfectly well-defined linear functional on $C[0, 1]$.
However, $\lim_{n\ \text{large}} g_n$ is not a function from $[0, 1]$ to $\mathbb{R}$ in any normal sense. If
you look at the diagram again, you will see that

$$\lim_{n\ \text{large}} g_n(x) = 0, \quad \text{when } x \neq \tfrac12,$$

since the intervals on which the $g_n$ are non-zero rapidly become narrower
and narrower, and eventually exclude the point $x$ if $x \neq \tfrac12$.




But $\lim_{n\ \text{large}} g_n(\tfrac12)$ does not exist; somehow $g_n(\tfrac12)$ “goes to infinity”, so if we wish to
picture $\lim_{n\ \text{large}} (g_n)$ as a function, it must be a function which is “equal to
infinity” at $\tfrac12$, and zero everywhere else! Physicists often find such an idea
useful as a mental picture, but clearly such an idea is not mathematically
satisfactory. To be mathematically sound, we should express $P_{1/2}$ simply in
terms of its effects on elements $f$ of $C[0, 1]$, which, of course, we can do
perfectly well.


So we see that we have generalized the notion of a function, and ended up
in the dual space to a space of functions. This is no academic exercise; as
we have said, physicists in the first half of this century were faced with
exactly this problem.


To get an inkling of the sort of situation in which the problem arises, look
back to sub-section 11.2.1 of Unit 11, Differential Equations III. We saw
there that a particular solution of the normal linear differential equation

$$L(y) = h \qquad (2)$$

can be expressed by the formula

$$y_p(x) = \int_{x_0}^{x} K(x, t)\,h(t)\,dt \qquad (3)$$

where $K(x, t)$ is a certain function of two real variables. Equation (3)
defines a linear transformation on the appropriate function space, under
which the image of the function $h$ is the function $y_p$. However, there are
circumstances in which $h$ is not a function in the usual sense, and yet a
solution of $L(y) = h$ is known to exist. If $L(y) = h$ represents a mechanical
system, for instance, with springs, weights, etc., then we might know that
at a time $t_0$ the system was banged with a hammer, setting it suddenly into
motion. The motion of the system is determined by $L$ and $h$, where $h$ would
then be related to the impulse that the system received through being banged.
But the assumption that it was banged with a hammer would mean that
$h(t)$ would be zero whenever $t$ was different from $t_0$, yet the integral (3)
would be non-zero. In fact, $h$ would be a “generalized function” of the
sort we have just seen. Symbolically, a generalized function like the above is
denoted by $\delta_{t_0}$ (the “Dirac delta function”). It is a function of $t$ such that
$\delta_{t_0}(t)$ is thought of as infinite at $t = t_0$ and equal to zero everywhere else,
but only makes sense when it comes after an integral sign, or as the
non-homogeneous part of a differential equation. In the case quoted above,
Equations (2) and (3) would become:

$$L(y) = \delta_{t_0}(t) \qquad (2a)$$

$$y_p(x) = \int_{x_0}^{x} K(x, t)\,\delta_{t_0}(t)\,dt \qquad (3a)$$




12.3.3 Annihilators 


In this sub-section we will take a brief look at another example of duality
arising out of vector space theory: the concept of an annihilator. It is an
important concept for you to grasp, as it is used later in the course (Unit
15, Affine Geometry and Convex Cones).


Definition
If S is any subset of a vector space V, then the annihilator of S is the
following subset of V̂ (denoted by S^⊥):

    S^⊥ = {φ ∈ V̂ : φ(α) = 0 for all α ∈ S}.

(S^⊥ is read as: "S-perp".)


This is a very broad definition: S can be any subset of V. In particular, S 
can be a subspace of V. 


Example 1 


Let V = R^3, S = <(1, 0, 0), (0, 1, 0)>, i.e. S is the subspace of V generated
by the first two basis elements of the standard basis. Then:

    S^⊥ = {φ ∈ V̂ : φ(α) = 0 for all α of the form (a, b, 0) ∈ V}.

Thus to find S^⊥, we must look for the set of all [x y z] such that

    [x y z] [a]
            [b] = 0   for all a, b ∈ R,
            [0]

i.e.

    xa + yb = 0   for all a, b ∈ R.        (1)

This condition places no restrictions on z, but clearly x and y must each be
equal to zero for Equation (1) to hold for all a, b ∈ R.

Thus,

    S^⊥ = {[0 0 z] : z ∈ R}
        = <[0 0 1]>.

We can see here why we use the ⊥ (perpendicular) sign. The elements
[0 0 z] are not in V, but if we ignore the type of brackets, the elements
(0, 0, z) lie along the z-axis representation of R^3, which is perpendicular
to the (x, y)-plane representation of <(1, 0, 0), (0, 1, 0)>. But, in spite of
the suggestion, we must remember that S^⊥ and S are not in the same space.


Example 2 


Let V = R^3 and let S = {(1, 0, 0), (0, 1, 0)}. That is, S is the set containing
these two elements, not the subspace spanned by these two elements as in
Example 1.

Then φ ∈ S^⊥ if and only if

    φ((1, 0, 0)) = φ((0, 1, 0)) = 0.

That is, [x y z] ∈ S^⊥ if and only if

    [x y z] [1]           [0]
            [0] = [x y z] [1] = 0,
            [0]           [0]

i.e. if and only if x = y = 0.

Once again, then,

    S^⊥ = {[0 0 z] : z ∈ R}
        = <[0 0 1]>.

In other words, although this time S is not a subspace of V, we still find
that S^⊥ is a subspace of V̂.


Example 3 


Let V = R^3, and let S = {(1, 2, 3), (1, 1, 1)}. Then [x y z] ∈ S^⊥ if and only if

    [x y z] [1]           [1]
            [2] = [x y z] [1] = 0,
            [3]           [1]

i.e., if and only if

    x + 2y + 3z = 0
    x +  y +  z = 0.

This should make it clear that an annihilator is really an old friend in a new
disguise: it is the solution set of a system of homogeneous linear equations,
i.e. of a "homogeneous linear problem" (Unit 3, pages N63-4). The only
difference is that we regard it as being a subset of V̂ rather than a subset of
V. You already know that the solution set of a homogeneous linear problem
is a subspace, so it should not be difficult to translate this knowledge
into the language of the present unit, and fill in the gaps in the proof of the
following theorem.


Theorem 7 


If S is any subset of V, then S^⊥ is a subspace of V̂.


Proof 
We have to prove that, for all a, b ∈ R, and φ, ψ ∈ S^⊥:

    ________ ∈ S^⊥                                  (i)

That is to say, we must show that if α is any element of S,

    (________)(α) = ________                        (ii)

We proceed:

    (________)(α) = ________ φ(α) + ________ ψ(α)   (iii)
                  = ________                        (iv)
                  = ________                        (v)

and the theorem is proved.

The lines with blanks should read as follows:

    (i)   aφ + bψ ∈ S^⊥
    (ii)  (aφ + bψ)(α) = 0
    (iii) (aφ + bψ)(α) = aφ(α) + bψ(α)
    (iv)  = 0 + 0
    (v)   = 0


Another thing you will notice by comparing Examples 1 and 2 is that in
this particular case the annihilator of {(1, 0, 0), (0, 1, 0)} is equal to the
annihilator of the subspace which they generate. This is another result
which is true in general, and is stated in the next theorem.


Theorem 8 


If S is any subset of a finite-dimensional space V, then S^⊥ = <S>^⊥.




Proof 


Every element of <S>^⊥ is contained in S^⊥, since if φ(α) = 0 for all α ∈ <S>,
it is certainly zero for all α ∈ S.

Conversely, if φ is any element of S^⊥, then the following argument shows
that it is also in <S>^⊥.

Let α ∈ <S>; we have to show that φ(α) = 0. Since <S> is generated by S,
we can find elements α_1, ..., α_k of S, and scalars a_1, ..., a_k, such that

    α = a_1 α_1 + ... + a_k α_k.

Then

    φ(α) = φ(a_1 α_1 + ... + a_k α_k)
         = a_1 φ(α_1) + ... + a_k φ(α_k)
         = 0 + ... + 0, since φ ∈ S^⊥,
         = 0,

and the theorem is proved.


Example 4 


Find a basis for the annihilator S^⊥ of S = <(2, 4, -1, 1, 2), (1, 2, 1, 8, 10)>.
By Theorem 8, the annihilator of S is the same as the annihilator of the set
{(2, 4, -1, 1, 2), (1, 2, 1, 8, 10)}. This is the set of linear functionals
[a b c d e] such that

    2a + 4b - c +  d +  2e = 0,
     a + 2b + c + 8d + 10e = 0.

To solve this system of equations, we bring the matrix of coefficients to
Hermite normal form; this results in the matrix

    [1 2 0 3 4]
    [0 0 1 5 6]

corresponding to equations

    a + 2b     + 3d + 4e = 0,
             c + 5d + 6e = 0.

We can take b, d and e as arbitrary; this gives

    a = -2b - 3d - 4e
    c =     - 5d - 6e

Thus

    [a b c d e] = [(-2b - 3d - 4e)  b  (-5d - 6e)  d  e]
                = b[-2 1 0 0 0] + d[-3 0 -5 1 0]
                  + e[-4 0 -6 0 1].

Thus the required basis for S^⊥ is

    {[-2 1 0 0 0], [-3 0 -5 1 0], [-4 0 -6 0 1]}.

Numerically, then, this calculation is equivalent to finding the kernel of
the linear transformation represented by the 2 x 5 matrix

    [2  4  -1  1   2]
    [1  2   1  8  10].
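
The calculation in Example 4 can be mechanized. The following is a minimal sketch, not part of the course text: it reduces the coefficient matrix to Hermite normal form with exact fractions and reads off one annihilator basis element for each free variable. The function names rref and annihilator_basis are our own.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced matrix, pivot columns), using exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a row at or below r with a non-zero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def annihilator_basis(vectors):
    """Basis of the row matrices [a b c ...] that vanish on every given vector."""
    m, pivots = rref(vectors)
    ncols = len(vectors[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        phi = [Fraction(0)] * ncols
        phi[free] = Fraction(1)                     # set one free variable to 1
        for row, p in zip(m, pivots):
            phi[p] = -row[free]                     # solve for the pivot variables
        basis.append(phi)
    return basis

S = [(2, 4, -1, 1, 2), (1, 2, 1, 8, 10)]
basis = annihilator_basis(S)
```

For the set S of Example 4 this reproduces the three basis elements found by hand, in the same order (free variables b, d, e).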
Exercises 


1. In Example 4 above, the dimensions of S and S^⊥ add up to 5, the
dimension of the original space. Does this always happen? Give a
reason for your answer.

2. Calculate the set of all vectors α ∈ V such that φ(α) = 0 for all φ ∈ S^⊥.
How does this set relate to the annihilator of S^⊥, (S^⊥)^⊥ = S^⊥⊥? (Hint:
answer the second part of the question first. Then apply the Dimension
Theorem to answer the first part.)


Solutions 


1. Yes. The dimension of S^⊥ is the dimension of the kernel of the
linear transformation from R^5 to R^2 represented by the
matrix

    [2  4  -1  1   2]
    [1  2   1  8  10].

On the other hand, the rows of this matrix represent the basis
of S, and so the rank of the matrix is equal to the dimension
of S. By the Dimension Theorem (Unit 2), these two dimensions
must add up to the dimension of the entire domain space.


2. The annihilator of a subset of a space is in the dual of that
space. Thus the annihilator of S^⊥ is in the dual of V̂, and is the set of all
linear functionals on V̂ which take the value 0 at every φ ∈ S^⊥. That is, if T
is the set of all vectors α ∈ V such that φ(α) = 0 for all φ ∈ S^⊥, then
S^⊥⊥ = J(T), where J is the natural isomorphism of V with the dual of V̂
discussed in sub-section 12.2.3. It is usual, though, to ignore
the isomorphism J altogether and regard V and the dual of V̂ as identical.
Thus, we denote by S^⊥⊥ the set of all α ∈ V such that φ(α) = 0
for all φ ∈ S^⊥.

Using Theorem 8, we see that to calculate S^⊥⊥, we need to
find the set of all α ∈ V such that φ_1(α) = φ_2(α) = φ_3(α) = 0,
where {φ_1, φ_2, φ_3} is the basis of S^⊥ which we found in
Example 4. That is, we must find the set of all (v, w, x, y, z)
such that

    -2v + w           = 0,
    -3v     - 5x + y  = 0,
    -4v     - 6x + z  = 0.

Now we could find a basis of S^⊥⊥ by reducing the coefficient
matrix of this system to Hermite normal form, but we can
get the answer more quickly by using the Dimension Theorem
in exactly the same way as in Solution 1. In other words, S^⊥ is
3-dimensional, so S^⊥⊥ must be 2-dimensional. Thus if we find
two linearly independent elements of S^⊥⊥, they must form a
basis for S^⊥⊥.

What about our original basis of S? This was {(2, 4, -1, 1, 2),
(1, 2, 1, 8, 10)}. Because S^⊥ annihilates S, it must be the case
that φ(α) = 0 for all φ ∈ S^⊥, if α is either (2, 4, -1, 1, 2) or
(1, 2, 1, 8, 10). Therefore, these vectors do indeed form a
basis for S^⊥⊥.
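
The last step of Solution 2 is easy to check by direct evaluation. The sketch below simply applies each of the three basis functionals of S^⊥ from Example 4 to each of the two generators of S; the helper name apply is ours.

```python
S_basis = [(2, 4, -1, 1, 2), (1, 2, 1, 8, 10)]
S_perp_basis = [(-2, 1, 0, 0, 0), (-3, 0, -5, 1, 0), (-4, 0, -6, 0, 1)]

def apply(phi, alpha):
    """Value of the functional [a b c d e] at the vector alpha (row times column)."""
    return sum(p * a for p, a in zip(phi, alpha))

# one row per functional, one column per generator of S
values = [[apply(phi, alpha) for alpha in S_basis] for phi in S_perp_basis]

# a non-zero 2 x 2 minor (first and third coordinates) shows that the two
# generators of S are linearly independent
minor = S_basis[0][0] * S_basis[1][2] - S_basis[0][2] * S_basis[1][0]
```

Every entry of values is zero, so both generators lie in S^⊥⊥; since they are linearly independent and S^⊥⊥ is 2-dimensional, they form a basis for it.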


We finish this sub-section by writing out the results we have just discovered 
in a special case, in the form of theorems. 


Theorem 9 


If S is a subspace of a finite-dimensional space V, then dim S + dim S^⊥ =
dim V.


Proof 
Let S be k-dimensional, and V be n-dimensional. Let {α_1, ..., α_k} be a
basis of S; then it can be extended to a basis {α_1, ..., α_k, α_{k+1}, ..., α_n} of V.

Let {φ_1, ..., φ_n} be the basis of V̂ dual to {α_1, ..., α_n}. Then any element φ
of V̂ is of the form

    φ = a_1 φ_1 + ... + a_n φ_n,

and it is not difficult to check that φ(α) = 0 for all α ∈ S, if and only if

    a_1 = a_2 = ... = a_k = 0.

That is, φ ∈ S^⊥ if and only if a_1 = a_2 = ... = a_k = 0, which is the same as
saying that {φ_{k+1}, ..., φ_n} is a basis for S^⊥. Thus, S^⊥ is (n - k)-dimensional,
and this proves the theorem.


Theorem 10 


If S is a subspace of a finite-dimensional space V, then S^⊥⊥ = S.


Proof 
If α ∈ S, then φ(α) = 0 for all φ ∈ S^⊥, and so α ∈ S^⊥⊥. Thus, S ⊆ S^⊥⊥.
But by Theorem 9,

    dim S + dim S^⊥ = dim V,

and

    dim S^⊥ + dim S^⊥⊥ = dim V̂
                       = dim V,

and hence dim S^⊥⊥ = dim S.

Therefore, S = S^⊥⊥ (see Theorem 4.7 on page N22).


Exercise 


Exercise 3, page N141.


Solution 
See page N333. 


12.3.4 Summary of Section 12.3 


In this section we defined the term 


annihilator (page C38) 


Theorems 


7. (page C39)
If S is any subset of V, S^⊥ is a subspace of V̂.


8. (page C39)
If S is any subset of a finite-dimensional space V, then S^⊥ = <S>^⊥.


9. (page C41)
If S is a subspace of a finite-dimensional space V, then dim S + dim S^⊥ =
dim V.


10. (page C42)
If S is a subspace of a finite-dimensional space V, then S^⊥⊥ = S.


Technique


Find a basis for the annihilator of a subset of V.


Notation
S^⊥ (page C38)
S^⊥⊥ (page C41)


READ Section IV-4, starting on page N138 as far as the end of page N139.




12.4 SUMMARY OF THE UNIT 


The aim of this unit is to introduce you to some powerful abstract
mathematical ideas, namely those connected with the concepts of linear
functionals and duality.


In the first section, we introduced the idea of linear functionals and dis- 
cussed the structure of the space of all linear functionals on a vector space 
V, the dual space of V. We discovered that it is itself a vector space, of 
the same dimension as V and closely connected to it in structure. 


In the second section we looked at the meaning of linear functionals on the
dual space V̂ and found that the dual space of V̂ is naturally isomorphic to
V itself. We also introduced the mapping J which establishes this natural
isomorphism between vectors in V and linear functionals on V̂.


The third section was used to illustrate the concept of dual space. The first 
example was a practical one in psychology—factor analysis. The second 
example was a theoretical one leading to the definition of the Dirac delta
function. The last sub-section dealt with the annihilator, i.e. the set of 
linear functionals on V which will annihilate a given subset of V. 


Definitions 
linear functional (page C5) 
dual space (page C10) 
dual basis (page C13) 
natural isomorphism (page C29) 
annihilator (page C38) 

Theorems 

1. (page C9)
The set of all linear functionals on V forms a vector space.

2. (page C11)
If V is n-dimensional, so is V̂.

3. (page C18)
Let V be a vector space with basis A and let Â be the dual basis in V̂. If
A' is a new basis of V and the matrix of transition from A to A' is P, then
the coordinates of Â' with respect to Â are the columns of Q, the matrix of
transition from Â to Â', and Q = (P^T)^(-1).

4. (page C26)
For any xe V, let &() = (a) for all ġe VP; then & is a linear functional 
on P. 

5. (page C28)

If dim Vis finite, then (i) dim Ñ = dim V, and (ii) J is an isomorphism of V 
onto Î. 

6. (page C29)

Vis the whole of Vif V is finite-dimensional. 


7. (page C39)
If S is any subset of V, S^⊥ is a subspace of V̂.


8. (page C39)
If S is any subset of a finite-dimensional space V, then S^⊥ = <S>^⊥.


9. (page C41)
If S is a subspace of a finite-dimensional space V, then dim S + dim S^⊥ =
dim V.


10. (page C42)
If S is a subspace of a finite-dimensional space V, then S^⊥⊥ = S.


Techniques 


1. Given a basis A of a vector space V, determine the corresponding dual
basis Â in the dual vector space V̂.


2. Find a basis for the annihilator of a subset of V. 


Notation 


[abc...] (page C9) 
v (page C9) 

A (page C13) 
& (page C26) 
v (page C26) 
V (page C26) 
S^⊥ (page C38)
S^⊥⊥ (page C41)


We can draw up a table to represent the concepts and their dual concepts, 
that we have seen in this unit: this is another way to summarize the unit. 


Concept             Dual
Vector space V      Dual space V̂
Basis A             Dual basis Â
Subspace S of V     Annihilator S^⊥ in V̂


* 


* 


12.5  SELF-ASSESSMENT 


Self-assessment Test 


This Self-assessment Test is designed to help you test quickly your
understanding of the unit. It can also be used, together with the summary of
the unit, for revision. The answers to these questions will be found on the next
non-facing page. We suggest you complete the whole test before looking
at the answers.


1. (i) Which of the following are linear functionals? Give reasons for
rejecting those that are not.

(a) φ: (x_1, x_2, x_3) ⟼ [x_1]            ((x_1, x_2, x_3) ∈ R^3)
                          [x_2]

(b) φ: (x_1, x_2, x_3) ⟼ 2x_1 + 3x_2      ((x_1, x_2, x_3) ∈ R^3)
(c) φ: (x_1, x_2, x_3) ⟼ πx_1             ((x_1, x_2, x_3) ∈ R^3)
(d) φ: (x_1, x_2, x_3) ⟼ x_1 + 1          ((x_1, x_2, x_3) ∈ R^3)
(e) φ: x ⟼ (x, x, x)                      (x ∈ R)
(f) φ: x ⟼ 2x                             (x ∈ R)
(g) φ: (x_1, x_2, x_3) ⟼ (x_2 - 3x_1)/2   ((x_1, x_2, x_3) ∈ R^3)

(ii) Write down the matrix representations, in the standard bases, of
those of the above functions which are linear functionals.


2. Is there always a natural isomorphism of U onto Û:
(i) if U is finite-dimensional?
(ii) if U is infinite-dimensional?
3. Is there always a natural isomorphism of U onto the dual of Û:
(i) if U is finite-dimensional?
(ii) if U is infinite-dimensional?
4. Complete the following:
If {α_1, ..., α_n} is a basis for U and {φ_1, ..., φ_n} is the dual basis for Û
then

    φ_i(α_j) = ________      (i, j = 1, ..., n)

5. {(1, 0, 0), (1, 1, 0), (1, 1, 2)} is a basis for R^3; calculate the dual basis
in the dual space of R^3.
6. Calculate a basis for the annihilator, in the dual space of R^4, of
<(1, 2, 3, 4), (4, 3, 2, 1)>.
7. If {α_1, α_2, α_3} and {α_1, α_2, α_3'} are two bases of R^3, differing only in the
third basis element, and if {φ_1, φ_2, φ_3} and {φ_1', φ_2', φ_3'} are the
corresponding dual bases, use the concept of an annihilator to prove that
φ_3' is a scalar multiple of φ_3.
8. A mathematically-minded landowner decides to start up a “ game- 
park” on his land, and wishes to calculate how many antelopes of 
various sorts his land can take. He consults 
(a) a book which tells how much land per animal various different 
species of antelope require; 

(b) a map of his land, to calculate how many acres he has at his 
disposal; 

(c) an employment agency, to tell him how many men are available to
work on the project.


He then calculates: 
(d) a suitable population of antelopes for his game-park. 


Which of the quantities measured by (a), (b) and (c) can be said to be 
in the “dual space” corresponding to quantity measured by (d)? 




Solutions to Self-assessment Test 


1. (i) The mappings in (b), (c), (f) and (g) are linear functionals. The
mappings (a) and (e) are not because their codomains are not R,
and the mapping (d) is not because φ(0) ≠ 0.

(ii) (b) [2 3 0]
     (c) [π 0 0]
     (f) [2]
     (g) [-3/2 1/2 0]

2. (i) No
   (ii) No

3. (i) Yes, the natural isomorphism: α ⟼ J(α).
   (ii) No

4. δ_ij

5. We must invert

    [1 1 1]
    [0 1 1]
    [0 0 2]

and read off the rows. This gives

    {[1 -1 0], [0 1 -1/2], [0 0 1/2]} as the basis.

6. The annihilator is the set of all [a b c d] such that

    a + 2b + 3c + 4d = 0,
    4a + 3b + 2c + d = 0.

Bringing the coefficient matrix to Hermite normal form, an equivalent
system is

    a      -  c - 2d = 0,
         b + 2c + 3d = 0.

Thus an arbitrary element of the annihilator is

    c[1 -2 1 0] + d[2 -3 0 1],

so a basis is

    {[1 -2 1 0], [2 -3 0 1]}.


7. The annihilator of <α_1, α_2> is one-dimensional. By the properties of
dual bases,

    φ_3(α_1) = 0 = φ_3(α_2),

and

    φ_3'(α_1) = 0 = φ_3'(α_2).

Thus, both φ_3 and φ_3' are in <α_1, α_2>^⊥, and they must therefore be
scalar multiples of one another.


8. The best answer is (a), since if he has (say) g gnus, h hartebeest, i impalas
and k kudu on his land, and if a gnu requires G acres, a hartebeest H
acres, an impala I acres and a kudu K acres, then the total amount of
land he must have in order for them to live healthily is

    [G H I K] [g]
              [h]  acres.
              [i]
              [k]

Unit 13 Systems of Differential Equations 


Contents

          Set Books
          Conventions
          Introduction

13.1      Systems of Differential Equations

13.1.1    A New Linear Operator
13.1.2    Normal First-order Systems
13.1.3    The Solution Space
13.1.4    The Wronskian
13.1.5    Summary of Section 13.1

13.2      Constant-coefficient Systems

13.2.0    Introduction
13.2.1    Real Eigenvalues with an Eigenvector Basis
13.2.2    Real Eigenvalues with no Eigenvector Basis
13.2.3    Complex Eigenvalues
13.2.4    Summary of Section 13.2

13.3      Applications

13.3.0    Introduction
13.3.1    A Double Mass Spring System
13.3.2    Electrical Networks
13.3.3    Summary of Section 13.3

13.4      Summary of the Unit

13.5      Self-assessment


Set Books 


D. L. Kreider, R. G. Kuller, D. R. Ostberg and F. W. Perkins, An Intro- 
duction to Linear Analysis (Addison-Wesley, 1966). 


E. D. Nering, Linear Algebra and Matrix Theory (John Wiley, 1970). 


It is essential to have these books; the course is based on them and will not 
make sense without them. 


This unit is based on a reprint, which is contained in the flap inside the back 
cover of this text. The reprint is taken from: 

D. L. Kreider, R. G. Kuller and D. R. Ostberg, Elementary Differential 
Equations (Addison-Wesley, 1968). 


Conventions 


Before working through this correspondence text make sure you have read 
A Guide to the Linear Mathematics Course. Of the typographical conven- 
tions given in the Guide the following are the most important. 


The set books are referred to as: 
K for An Introduction to Linear Analysis 
N for Linear Algebra and Matrix Theory 
The reprint is referred to as S. 


All starred items in the summaries are examinable. 


References to the Open University Mathematics Foundation Course Units 
(The Open University Press, 1971) take the form Unit M100 3, Operations 
and Morphisms. 


13.0 Introduction 


We have discussed on several occasions the representation of an electrical 


network by a differential equation using Kirchhoff’s Laws. 


[Figure: electrical network with a constant current source]


For example, the network shown in the diagram is mentioned in Unit 4, 
Differential Equations I (Exercise 3, sub-section 4.2); it is described by a 
system of equations 


di, dq 
Ra = Ca =0 
iC) +h) = io (te Ro) 
P dq, 
ht) = E 


We could solve these equations by eliminating all but one of the unknown 
functions to obtain a single differential equation. In this unit we adopt a 
different approach which allows us to solve directly such a system of dif- 
ferential equations. 


The unit is divided into three sections. In the first section we describe what 
we mean by a system of differential equations, and having done that we 
use the analogy with a single linear differential equation to say what it 
means to find a solution and to consider the existence of solutions. We also 
show how a single differential equation can be considered as a first order 
system of differential equations. In the second section we shall look at 
constant-coefficient homogeneous systems and use the methods of eigen- 
vectors, eigenvalues and Jordan normal form to solve such systems. The 
final section deals with physical applications. 


This unit covers most of the material from a supplement.* This material on 
systems of linear differential equations by the authors Kreider, Kuller and 
Ostberg is written in the same style as K. If you are happy with this style, 
you may wish to read more on the subject than just the selection we treat in 
this unit. You will also find a discussion in Section VI.7 of N. 


We use the symbol S to refer to the Supplement, which is contained in the 
flap inside the back cover of this text. 


*This material is taken from: D. L. Kreider, R. G. Kuller and D. R. Ostberg, Elementary 
Differential Equations (Addison-Wesley, 1968). 


13.1 SYSTEMS OF DIFFERENTIAL EQUATIONS 
13.1.1 A New Linear Operator 
READ Section 5-6, pages S224-225.


Notes 


(i) line 12, page S224 “in the preceding section”. This corresponds to the dis- 
cussion of linear problems in Unit 3, Hermite Normal Form (pages N63-64). 

(ii) Equation (5-21), page S224 This equation is not an application of matrix
multiplication but a definition of how L operates on a column matrix X. The
usual interpretation of a matrix as representing a linear transformation is not
valid because the elements L_ij are differential operators, and these do not form
a field (e.g. D has no multiplicative inverse over C^∞[a, b]).

(iii) line -15, page S225 "operator equations". This is another expression for
"linear problems" (see page K80).


Exercise 


Exercise 2, page S225.


Solution 
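Let X_1 and X_2 be two vectors in U_n, and λ_1, λ_2 ∈ R. Then if
X_i = (x_1i, ..., x_ni),

    L(λ_1 X_1 + λ_2 X_2)

      = [L_11 ... L_1n] [λ_1 x_11 + λ_2 x_12]
        [  :        :  ] [         :         ]
        [L_m1 ... L_mn] [λ_1 x_n1 + λ_2 x_n2]

      = [L_11(λ_1 x_11 + λ_2 x_12) + ... + L_1n(λ_1 x_n1 + λ_2 x_n2)]
        [                            :                              ]
        [L_m1(λ_1 x_11 + λ_2 x_12) + ... + L_mn(λ_1 x_n1 + λ_2 x_n2)]

      = λ_1 [L_11 x_11 + ... + L_1n x_n1]
            [             :             ]
            [L_m1 x_11 + ... + L_mn x_n1]

        + λ_2 [L_11 x_12 + ... + L_1n x_n2]
              [             :             ]
              [L_m1 x_12 + ... + L_mn x_n2]

      = λ_1 LX_1 + λ_2 LX_2.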


13.1.2 Normal First-order Systems 


If we want to solve some systems of equations we have to restrict the type
of system that we shall be looking at. We shall restrict ourselves to
first-order systems from now on. In fact, we shall find that this is not as
serious a restriction as it seems, because a lot of systems can be reduced to
first-order systems.

READ Section 5-7 from page S225 to the end of page S227.

Notes

(i) line -4 to the end of page S225 For example, the system (written in terms of
functions and derived functions)

    (D + 1)x_1 + Dx_2 = h_1
    (D - 1)x_1 - x_2 = h_2

is of first order. This is the same as

    Dx_1 + Dx_2 = -x_1 + h_1
    Dx_1 = x_1 + x_2 + h_2

Solving for Dx_1 and Dx_2 we get

    Dx_1 = x_1 + x_2 + h_2
    Dx_2 = -2x_1 - x_2 + h_1 - h_2.

We can usually rearrange a first-order system where the number of equations is
the same as the number of unknowns in precisely this way into the normal form.

(ii) line 1, page S227 "functional n-space" means U_n.

(iii) line 8, page S227 What is meant here is that X_0 ∈ R^n is the n-tuple of images
of t_0 under the functions x_1, ..., x_n.

(iv) line -1, page S227 Chapter 2 here corresponds to Chapter 3 of K.
Exercises 


1. Exercise 1, page S231.
2. Exercise 2, page S231.
3. Put the following system of equations into normal form.

    Dx_1(t) + x_2(t) + Dx_3(t) = e^t
    x_1(t) + x_2(t) + Dx_3(t) = 0
    Dx_2(t) + 2x_3(t) = sin t.
Solutions 
1. (a) First write out the system in terms of equations to get

    x_1'(t) = x_2(t),   x_2'(t) = x_3(t),
    x_3'(t) = 2x_1(t) + x_2(t) - 3x_3(t) + e^t.

Now use the first two equations to write everything in
terms of x_1(t) and its derivatives.

    x_1'''(t) = 2x_1(t) + x_1'(t) - 3x_1''(t) + e^t

Rearranging and writing x for x_1,

    x'''(t) + 3x''(t) - x'(t) - 2x(t) = e^t.
(b) Ina similar way to that in (a) we obtain 
x") — g(x" — POx) = (0). 
2. Follow the procedure on page S226. 


1(t) 0 1l 0 x(t) 0 
= XD =| 0 0 1 x| + K 0 
x5(1) -e 0 —cost} [x(t P+ 
u(t 01 0 O7f x) 0 
ms a 00 Lt Off x(t) 0 
xA Tlo 0 o If x 0 
x(t) 0 0 ~e O}L x) cost 


3. Use the procedure in note (i) to the reading passage. Solve for
the derivatives.

    Dx_1(t) = x_1(t) + e^t
    Dx_2(t) = -2x_3(t) + sin t
    Dx_3(t) = -x_1(t) - x_2(t)

so that

    [Dx_1(t)]   [ 1  0  0] [x_1(t)]   [e^t  ]
    [Dx_2(t)] = [ 0  0 -2] [x_2(t)] + [sin t]
    [Dx_3(t)]   [-1 -1  0] [x_3(t)]   [0    ]


13.1.3 The Solution Space 


We turn now to the problem of trying to solve a system of linear differential 
equations. We proceed much as we did in Unit 9, Differential Equations I, 
where we discussed ordinary differential equations. 


Before solving specific systems, we first look to see if there are in fact
solutions to be found and, if so, how many. In other words, do we have an
existence and uniqueness theorem (page K104)? And what is the dimension
of the solution space (page K106)?


READ from the top of page S228 to the end of the statement of Lemma 5-1, 
and then the statement of Theorem 5-12 on page S229. 


Notes 
(i) line 4, page S228 If we let D be the differentiation operator from U_n to U_n,

    D: X ⟼ X'

then X' = AX is equivalent to (D - A)X = 0. Thus the solution set is the kernel
of the linear operator D - A acting on U_n.

(ii) Theorem 5-11, page S228 This theorem can be stated in another way
using an idea from Unit 9. Let W be the solution set of the system of equations
considered, and let E denote the function from W to R^n defined by

    E: X ⟼ X(t_0).

Then the theorem states that E is one-to-one (since the X for a given X_0 is
unique) and onto (since this X exists for all X_0). In the homogeneous case W is a
subspace and E is a linear transformation, so that it is a (vector space)
isomorphism. We call it the initial condition isomorphism.

(iii) line 17, page S228 Chapter 2 corresponds to Chapter 3 of K. Chapter 9
material is covered in Unit 33, Existence and Uniqueness Theorem for Differential
Equations.

(iv) Lemma 5-1, page S228 X_1(t_0), ..., X_k(t_0) are vectors in R^n since each X_i
is an n-tuple of functions and each X_i(t_0) is the n-tuple of values of these
functions at t_0. This lemma follows directly from the existence of the initial
condition isomorphism.

(v) Theorem 5-12, page S229 This also follows directly from the fact that E is an
isomorphism. Since an nth order differential equation is equivalent to an n x n
first-order system, the corresponding result for the solution space is a particular
case of Theorem 5-12.


Exercise 


Let

    A = [ 0  2]      B(t) = [cos t  ]
        [-2  0]             [2 sin t]
(i) Which of the following vectors in U, are solutions of X’ = AX? 


(a) le A (b) T cos al (O) l sin z] 
sin sin cos 
(d) F sin | 


$ cost 


(ii) Show that X(t) = [sin t] is a solution of X' = AX + B.
                      [0    ]


(iii) Find the complete solution of the system X' = AX + B.
Solution 


(i) We try each in turn. For example if

    X(t) = [cos 2t]   then   X'(t) = [-2 sin 2t] = 2 [-sin 2t]
           [sin 2t]                  [ 2 cos 2t]     [ cos 2t]

    AX(t) = [ 0  2] [cos 2t] = [ 2 sin 2t] = 2 [ sin 2t]
            [-2  0] [sin 2t]   [-2 cos 2t]     [-cos 2t]

Thus X' ≠ AX, and so this X(t) is not a solution. Similarly,
(b) and (c) are solutions; (d) is not.

(ii) X'(t) = [cos t]   and   AX(t) = [ 0  2] [sin t] = [ 0      ]
             [0    ]                 [-2  0] [0    ]   [-2 sin t]

Therefore

    AX(t) + B(t) = [ 0      ] + [cos t  ] = [cos t]
                   [-2 sin t]   [2 sin t]   [0    ]

Hence, X' = AX + B, if X(t) = [sin t].
                              [0    ]

(iii) We must
(a) find one particular solution
and
(b) solve X' = AX.

(a) has been answered in part (ii):

    X(t) = [sin t]
           [0    ]

is a particular solution.

(b) To solve X' = AX, find a basis for the solution space.
Theorem 5-12 tells us that we need two basis vectors.
In part (i) we found two solutions for X' = AX:

    X_1(t) = [-cos 2t]      X_2(t) = [sin 2t]
             [ sin 2t]               [cos 2t]

These will form a basis for the solution space if they are
linearly independent. To test for linear independence
we use Lemma 5-1. Let t_0 = 0, then

    X_1(0) = [-1],   X_2(0) = [0].
             [ 0]             [1]

Since these vectors are linearly independent in R^2, X_1
and X_2 are linearly independent in U_2.

We can now write down the complete solution. Every
solution of X' = AX + B has the form

    X(t) = λ [-cos 2t] + μ [sin 2t] + [sin t]
             [ sin 2t]     [cos 2t]   [0    ]

         = [sin t - λ cos 2t + μ sin 2t]
           [λ sin 2t + μ cos 2t        ]

where λ, μ are real numbers.
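
A complete solution of this kind can be checked numerically without solving anything. The sketch below (our own illustration; the names X, B and residual are hypothetical) compares a central-difference estimate of X'(t) with AX(t) + B(t) for several choices of the two arbitrary constants, written lam and mu in the code.

```python
from math import sin, cos

A = [[0.0, 2.0], [-2.0, 0.0]]

def B(t):
    return [cos(t), 2.0 * sin(t)]

def X(t, lam, mu):
    """The complete solution derived above, with arbitrary constants lam and mu."""
    return [sin(t) - lam * cos(2 * t) + mu * sin(2 * t),
            lam * sin(2 * t) + mu * cos(2 * t)]

def residual(t, lam, mu, h=1e-5):
    """Largest component of X'(t) - (A X(t) + B(t)); X' is a central difference."""
    x_plus, x_minus, x = X(t + h, lam, mu), X(t - h, lam, mu), X(t, lam, mu)
    deriv = [(p - m) / (2 * h) for p, m in zip(x_plus, x_minus)]
    rhs = [sum(A[i][j] * x[j] for j in range(2)) + B(t)[i] for i in range(2)]
    return max(abs(d - r) for d, r in zip(deriv, rhs))

# sample t in [-3, 3] and a few choices of the constants
worst = max(residual(t / 10, lam, mu)
            for t in range(-30, 31)
            for lam, mu in [(0, 0), (1, 0), (0, 1), (2.5, -1.5)])
```

The value of worst stays at the level of the finite-difference error, which is what we expect if every member of the family satisfies X' = AX + B.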


13.1.4 The Wronskian 


The solution space of an n x n first-order system of equations has
dimension n, and if {X_1, ..., X_n} is a basis (each X_i being an n-tuple of functions)
then the general solution is Σ_{i=1}^{n} c_i X_i. In general, given n solutions
X_1, ..., X_n, we must have some means of determining whether they are
linearly independent. As we saw in the Exercise of the previous sub-section
the method is to use Lemma 5-1, i.e. we choose a suitable point t_0 at which
we define an initial condition isomorphism E and consider whether
EX_1, ..., EX_n ∈ R^n are linearly independent. This is equivalent to demanding
that the determinant D(EX_1, ..., EX_n) ≠ 0, for any t_0 in the interval I.


We extend this idea by defining an initial condition isomorphism E_t for
each point t ∈ I. Then the real function with domain I defined by

    t ⟼ D(E_t X_1, ..., E_t X_n) = | x_11(t) ... x_1n(t) |
                                    |    :          :     |
                                    | x_n1(t) ... x_nn(t) |

where X_i = (x_1i, ..., x_ni), is called the Wronskian and is denoted by

    W[X_1, ..., X_n].

READ from line -5 on page S229 to line 12 of page S230.


Notes 


(i) line 1, page S230 There is nothing unusual about the determinant. All
that we are doing is expressing the matrix elements in the form of the image of
a function.

(ii) line 5, page S230 "if and only if" Both these conditions follow immediately
by the isomorphism between W and R^n.

(iii) line 11, page S230 The Wronskian is defined in Section 3-6 of K. The
system of equations which was introduced ad hoc in Unit 9, sub-section 9.2.2,
is the basis of the idea.


Exercises 


1. (i) Show that X_1(t) = [1] and X_2(t) = [-e^{3t}] are solutions of
                          [2]              [ e^{3t}]

    X' = AX   (t ∈ R),

where A = [ 2 -1]
          [-2  1].

(ii) Determine the Wronskian of these two solutions.
(iii) Characterise the solution space of X' = AX in terms of X_1 and X_2.


2. (i) Write (D^2 - 1)x = 0 as a normal system in matrix form.
(ii) Use the Remark on page S230 with the differential equation
(D^2 - 1)x = 0 to obtain a basis for the solution space of the
system you obtained in part (i).


Solutions 


1. (i) X_1(t) = [1]; therefore X_1'(t) = [0].
                [2]                      [0]

    AX_1(t) = [ 2 -1] [1] = [0] = X_1'(t).
              [-2  1] [2]   [0]

    X_2(t) = [-e^{3t}]; therefore X_2'(t) = [-3e^{3t}].
             [ e^{3t}]                      [ 3e^{3t}]

    AX_2(t) = [ 2 -1] [-e^{3t}] = [-3e^{3t}] = X_2'(t).
              [-2  1] [ e^{3t}]   [ 3e^{3t}]

(ii) W[X_1, X_2] is given by

    t ⟼ | 1  -e^{3t} | = 3e^{3t} ≠ 0   (t ∈ R).
        | 2   e^{3t} |

(iii) By part (ii), X_1 and X_2 are linearly independent. Thus
the solution space is

    {X : there exist c_1, c_2 ∈ R such that X = c_1 X_1 + c_2 X_2}.

2. (i) DX = [0 1] X
            [1 0]

(ii) The solutions of a differential equation with constant
coefficients are t ⟼ e^{λt} where λ satisfies the characteristic
equation. In our case we require λ^2 - 1 = 0, i.e.
λ = ±1. This gives the basis {e^t, e^{-t}} for the solution
space of (D^2 - 1)x = 0. The Wronskian is (see Unit 9)

    W[e^t, e^{-t}]: t ⟼ | e^t   e^{-t} |
                        | e^t  -e^{-t} |

By the Remark this is the Wronskian for a basis of the
solution space {X : DX = [0 1] X}. Thus a basis is
                         [1 0]

    { [e^t],  [ e^{-t}] }.
      [e^t]   [-e^{-t}]
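
The Wronskian in Solution 1 can be evaluated directly from its definition as a determinant. The sketch below assumes the reading X_1(t) = (1, 2) and X_2(t) = (-e^{3t}, e^{3t}) of the two solutions (the printed matrices are hard to read, but these are eigenvector solutions of the given A and give the stated value 3e^{3t}); the function names are ours.

```python
from math import exp, isclose

def X1(t):
    return [1.0, 2.0]

def X2(t):
    return [-exp(3 * t), exp(3 * t)]

def wronskian(t):
    """Determinant of the 2 x 2 matrix whose columns are X1(t) and X2(t)."""
    a, b = X1(t), X2(t)
    return a[0] * b[1] - b[0] * a[1]

samples = {t: wronskian(t) for t in (-1.0, 0.0, 0.5, 2.0)}
```

At every sample point the value agrees with 3e^{3t}, which is never zero, so the two solutions are linearly independent on the whole of R.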
13.1.5 Summary of Section 13.1 


In this section we have defined the terms 


system of linear differential equations (page S224) 

homogeneous system (page S224) 

nonhomogeneous system (page S224) 

operator matrix (page S224) 

normal first-order system (page S225) 

Wronskian (page C10) 
Theorems 


1. (Theorem 5-11, page S228)
Let X' = A(t)X + B(t)
be a normal n x n first-order system of linear differential equations defined
on an interval I. Then if t_0 is any point in I and X_0 is any vector in R^n, the
given system has a unique solution X = X(t) such that X(t_0) = X_0.


2. (Initial Condition Isomorphism Theorem, page C8)
Let W be the solution set of X'(t) = A(t)X(t) + B(t), t ∈ I, and let
E: W ⟶ R^n be specified by E(X) = X(t_0), t_0 ∈ I. Then E is one-to-one
and onto.


3. (Theorem 5-12, page S229)
The dimension of the solution space W of any homogeneous n x n system
X' = A(t)X is n, the number of equations in the system.


Technique
Given a system of first-order linear differential equations, put it into normal
form.


Notation 

x (page S224) 
V, (page S225) 
E (page C8) 
E, (page C10) 


$W[X_1, \ldots, X_n]$ (page C10)


13.2 CONSTANT-COEFFICIENT SYSTEMS 
13.2.0 Introduction 


In the preceding section we saw that a normal first-order system can be 
written in the form 


(D- A)X=B 


where D is the differentiation operator from U, to U, defined in sub- 
section 13.1.3. As this is a linear problem we immediately know that solving 
the system involves: 


(i) finding a particular solution $X_p$ such that $(D - A)X_p = B$;
(ii) finding the kernel of $(D - A)$.


As with a single differential equation there is no specific method for finding
particular solutions, but we could develop a method of variation of parameters
in the manner of Unit 11, Differential Equations III. This method
will not be dealt with in this unit, though it is described in the remainder
of Section 5-7 of S. If you have the time you might like to read it.


In this section we discuss the second part of the problem: solving
$(D - A)X = 0$.


We know from the previous section that solving $(D - A)X = 0$ is precisely
the problem of finding a basis for an n-dimensional subspace of
$U_n$. ($U_n$ is itself an infinite dimensional vector space.) As in Unit 9, we will
find a basis for this kernel when we have a constant-coefficient system, that
is when A is a matrix of constants. In this case we can apply the theory we
have developed in Units 5 and 10, Determinants and Eigenvalues and
Jordan Normal Form; we can diagonalize A or reduce A to Jordan normal
form.


Before you proceed you should make sure that you are familiar with the 
definitions of eigenvalue, eigenvector, and characteristic equation treated in 
Unit 5, and that you know how to find the eigenvectors of a 3 x 3 matrix. 
An alternative method for solving constant-coefficient systems is dealt with 
in Section 5-8 of S. We shall not be discussing it in this text. 


This section is in three parts. We first consider the simple case where A is 
diagonable over R. We then generalize to the case when the characteristic 
polynomial can be expressed in linear factors so that the Jordan normal 
form is a real matrix. Finally we study the case when the characteristic 
polynomial has irreducible factors, resulting in a periodic term as in 
Unit 9. 


13.2.1 Real Eigenvalues with an Eigenvector Basis 


We are considering the system X' = AX where A is a constant matrix and
we are looking for an n-tuple of functions X which is a solution of this
system. If we had been considering a single equation x' = ax, then a
solution would have been $x(t) = e^{at}$, and this suggests that a solution to
X' = AX might be an n-tuple of exponential functions $t \mapsto e^{\lambda t}$,
where the scalars $\lambda$ somehow characterize the matrix A.

READ Section 5-11 on page S250 to line -11 on page S252.


Notes 


(i) Lemma 5-3, page S251 Because A is a matrix of constants we can calculate
eigenvalues and eigenvectors for A. The n-tuple of functions $X_\lambda$ is the n-tuple

\[ X_\lambda(t) = \begin{bmatrix} e_1 e^{\lambda t} \\ \vdots \\ e_n e^{\lambda t} \end{bmatrix} = e^{\lambda t} E_\lambda. \]

(ii) line 10, page S251 This line follows since $X_\lambda = E_\lambda e^{\lambda t}$.

(iii) line 14, page S251 Here we could have used Lemma 5-1 on page S228 with
$t_0 = 0$.

(iv) line 17, page S251 Eigenvectors for distinct eigenvalues are linearly independent,
as we saw in Unit 5 (page K463, Theorem 12-1).


Exercise 
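Solve the system of equations

\[ Dx_1 = 5x_1 + 4x_2 \]
\[ Dx_2 = -x_1. \]


Solution


The system may be written in matrix form as X' = AX with

\[ A = \begin{bmatrix} 5 & 4 \\ -1 & 0 \end{bmatrix}. \]

\[ |A - \lambda I| = \lambda^2 - 5\lambda + 4 = (\lambda - 1)(\lambda - 4). \]

Let $E_1 = (e_1, e_2)$* be an eigenvector corresponding to the eigenvalue
1. Then

\[ \begin{bmatrix} 4 & 4 \\ -1 & -1 \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

and a solution is $e_1 = 1$, $e_2 = -1$,
i.e. $E_1 = (1, -1)$.

Similarly $(A - 4I)E_4 = 0$, i.e.

\[ \begin{bmatrix} 1 & 4 \\ -1 & -4 \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

which has a solution
$E_4 = (4, -1)$.
Hence using Lemma 5-3 the general solution is
$c_1 E_1 e^{t} + c_2 E_4 e^{4t}$, $(c_1, c_2 \in R)$,
i.e. $\{E_1 e^{t}, E_4 e^{4t}\}$ form a basis for the solution space.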


The method described in the reading passage and the exercise allows us to
solve the matrix equation X' = AX when the n x n matrix A has n distinct
eigenvalues. The object of requiring n distinct eigenvalues was to ensure n
linearly independent eigenvectors in $R^n$, which by Lemma 5-1 would yield n
linearly independent eigenvectors in $U_n$ which form a basis for the solution
space. In fact we know that it is possible to have an eigenvector basis for
$R^n$ even when there are fewer than n distinct eigenvalues, in which case the
theory of this sub-section remains valid.

* Recall that 1-column matrices can be written $(a_1, a_2, \ldots, a_n)$.

Theorem 


If the eigenvectors $E_1, \ldots, E_n$ of an n x n matrix A form a basis for $R^n$,
then the general solution of X' = AX is

\[ X(t) = c_1 E_1 e^{\lambda_1 t} + \cdots + c_n E_n e^{\lambda_n t} \]

where $E_i$ is an eigenvector corresponding to $\lambda_i$.


Example 
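To solve X' = AX where

\[ A = \begin{bmatrix} 4 & -1 & -1 \\ 1 & 2 & -1 \\ 1 & -1 & 2 \end{bmatrix} \]

we first find the eigenvalues of A.

The characteristic polynomial of A is, as you can check,
$|A - \lambda I| = -(\lambda - 3)^2(\lambda - 2)$. The eigenvalues satisfy
$|A - \lambda I| = 0$, i.e. $\lambda = 2$, $\lambda = 3$.
The eigenvector $E_2 = (e_1, e_2, e_3)$ corresponding to the eigenvalue 2 must
satisfy

\[ (A - 2I)E_2 = 0, \quad \text{i.e.} \quad \begin{bmatrix} 2 & -1 & -1 \\ 1 & 0 & -1 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]

A suitable choice is $E_2 = (1, 1, 1)$. Similarly $(A - 3I)E_3 = 0$, i.e.

\[ \begin{bmatrix} 1 & -1 & -1 \\ 1 & -1 & -1 \\ 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]

The solution space of this equation is two-dimensional and we can choose
the basis $\{E_3^{(1)}, E_3^{(2)}\}$ where

\[ E_3^{(1)} = (1, 1, 0) \quad \text{and} \quad E_3^{(2)} = (1, 0, 1). \]

We now have the basis $\{E_2, E_3^{(1)}, E_3^{(2)}\}$ for $R^3$ and the general solution of
X' = AX is

\[ c_1 E_2 e^{2t} + (c_2 E_3^{(1)} + c_3 E_3^{(2)}) e^{3t} \]
\[ = (c_1 e^{2t} + (c_2 + c_3) e^{3t},\ c_1 e^{2t} + c_2 e^{3t},\ c_1 e^{2t} + c_3 e^{3t}). \]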
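
As a check on the example above, the following Python sketch (not part of the original text; the function names are our own, and only the standard library is assumed) confirms that the three quoted vectors are eigenvectors of A and that the general solution satisfies X' = AX at a sample point. The derivative is formed term by term from the exponentials rather than numerically.

```python
import math

# Matrix and (eigenvalue, eigenvector) pairs quoted in the example above.
A = [[4, -1, -1],
     [1,  2, -1],
     [1, -1,  2]]
EIGENPAIRS = [(2, (1, 1, 1)), (3, (1, 1, 0)), (3, (1, 0, 1))]


def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def solution(c, t):
    """X(t) = c1*E_2*e^(2t) + c2*E_3^(1)*e^(3t) + c3*E_3^(2)*e^(3t)."""
    return [sum(ck * e[i] * math.exp(lam * t)
                for ck, (lam, e) in zip(c, EIGENPAIRS))
            for i in range(3)]


def derivative(c, t):
    """X'(t), differentiating each exponential term exactly."""
    return [sum(ck * lam * e[i] * math.exp(lam * t)
                for ck, (lam, e) in zip(c, EIGENPAIRS))
            for i in range(3)]


def residual(c, t):
    """Largest absolute component of X'(t) - A X(t); zero for a solution."""
    ax = mat_vec(A, solution(c, t))
    return max(abs(d - a) for d, a in zip(derivative(c, t), ax))
```

Calling `residual` with any constants and any t should return a value at the level of rounding error; a large value would indicate a mistake in an eigenvector.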
Exercise 
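Find a basis for the solution space of the system

\[ x_1' = 5x_1 - 6x_2 - 6x_3 \]
\[ x_2' = -x_1 + 4x_2 + 2x_3 \]
\[ x_3' = 3x_1 - 6x_2 - 4x_3 \]

\[ \left(\text{Hint: } \begin{vmatrix} 5 - \lambda & -6 & -6 \\ -1 & 4 - \lambda & 2 \\ 3 & -6 & -4 - \lambda \end{vmatrix} = -(\lambda - 1)(\lambda - 2)^2 \right) \]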
Solution 


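If $X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, we can write the system in the form X' = AX with

\[ A = \begin{bmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{bmatrix}. \]

We proceed precisely as we did in the text and calculate the eigenvalues
of A. The characteristic equation is $\det(A - \lambda I) = 0$, i.e.

\[ (\lambda - 1)(\lambda - 2)^2 = 0, \]

using the hint. The eigenvalues are 1 and 2. We now calculate the
eigenvectors.

For $\lambda = 1$:

\[ (A - I)\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 4 & -6 & -6 \\ -1 & 3 & 2 \\ 3 & -6 & -5 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

and we can take $E_1 = (3, -1, 3)$.

For $\lambda = 2$:

\[ (A - 2I)\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 3 & -6 & -6 \\ -1 & 2 & 2 \\ 3 & -6 & -6 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

There are two linearly independent eigenvectors $E_2^{(1)} = (2, 1, 0)$
and $E_2^{(2)} = (2, 0, 1)$, so that $X_1(t) = (3e^{t}, -e^{t}, 3e^{t})$,
$X_2^{(1)}(t) = (2e^{2t}, e^{2t}, 0)$, $X_2^{(2)}(t) = (2e^{2t}, 0, e^{2t})$ form a basis for the solution
space.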


13.2.2 Real Eigenvalues with no Eigenvector Basis


What we have done in the last sub-section appears to have a very ad hoc
flavour about it. We have looked at the system X' = AX and we have
seen that if A has an eigenvector basis then we can write down the general
solution of our system. Now we know that not every matrix A has an
eigenvector basis (Unit 10). In this sub-section we will look at what we have
done so far in a new light so that we can deal with the case where the
characteristic polynomial of A can be written as a product of (real) linear
factors, even if A does not have an eigenvector basis.


First of all then, can we look at what we have done in the previous sub-section
from a different point of view? There A is diagonable; that is,
there is a matrix P such that $P^{-1}AP$ is a diagonal matrix with the eigenvalues
of A down the main diagonal. P, the matrix of transition, is a matrix
whose columns are eigenvectors of A (Unit 10, Section 2). Can we exploit
this to solve our systems of equations? We can. If X' = AX is our system
and S any non-singular matrix then we rewrite our system in the form

\[ X' = SS^{-1}ASS^{-1}X \]

and interpret S as an endomorphism of $U_n$, writing

\[ (S^{-1}X)' = S^{-1}AS(S^{-1}X). \]

Finally putting Y as the image of X under $S^{-1}$, i.e. $Y = S^{-1}X$, we find

\[ Y' = (S^{-1}AS)Y. \]

So we have reduced the system X' = AX to another system $Y' = (S^{-1}AS)Y$.
Now note that if X is a solution of the first system then $S^{-1}X$ is a solution
of the second, and conversely if Y is a solution of the second then SY is a
solution of the first. That is, we have equivalent systems. To solve the
system X' = AX, we can solve $Y' = (S^{-1}AS)Y$, and the solution of
X' = AX is X = SY.

This is what we have been doing in the preceding sub-section, where S was
the matrix of transition which made $S^{-1}AS$ diagonal. In practice, rather
than evaluate the transition matrix P and then compute the result X = PY,
it is usually more straightforward to proceed "by hand".




Example 
We reconsider Example 1 on page S251. Here

\[ A = \begin{bmatrix} 1 & 3 \\ 1 & -1 \end{bmatrix} \]

and an eigenvector basis is
$E_2 = (3, 1)$ and $E_{-2} = (-1, 1)$.
For each $t \in I$, X(t) is a vector in $R^2$ so that

\[ X(t) = y_1(t)E_2 + y_2(t)E_{-2}. \]

In this way we define two real-valued functions $y_1$, $y_2$ by
$y_i : t \mapsto y_i(t)$ $(i = 1, 2)$.
Using this expansion in the matrix equation
X' = AX
we obtain

\[ y_1'(t)E_2 + y_2'(t)E_{-2} = y_1(t)AE_2 + y_2(t)AE_{-2} = 2y_1(t)E_2 - 2y_2(t)E_{-2}. \]

This is the equivalent system $Y' = (P^{-1}AP)Y$. But $E_2$ and $E_{-2}$ are linearly
independent in $R^2$, so that

\[ y_1'(t) = 2y_1(t) \]
\[ y_2'(t) = -2y_2(t) \]

with general solutions

\[ y_1(t) = c_1 e^{2t}, \quad y_2(t) = c_2 e^{-2t}. \]

Thus

\[ X(t) = c_1 e^{2t}E_2 + c_2 e^{-2t}E_{-2}. \]

This is the solution found on page S252.
Doing the example this way is longer than applying Lemma 5-3 on page
S251, because we need the eigenvalues and eigenvectors for A and having
got these we can write down the solutions immediately using Lemma 5-3.
But the advantage of this method is that it is constructive so that we can
use it for those cases where A is not diagonable. We illustrate how it works
in the next example.


Example 


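Solve the system X' = AX, where

\[ A = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 3 & 0 \\ -1 & 4 & -1 \end{bmatrix}. \]

We saw in sub-section 10.3.2 that A does not have an eigenvector basis.
However, its Jordan basis is $\{E_{-1}, E_2, E_2'\}$ where

\[ E_{-1} = (0, 0, 1), \qquad AE_{-1} = -E_{-1} \]
\[ E_2 = (1, 1, 1), \qquad AE_2 = 2E_2 \]
\[ E_2' = (-1, 0, 0), \qquad AE_2' = 2E_2' + E_2 \]

and hence the matrix

\[ P = \begin{bmatrix} 0 & 1 & -1 \\ 0 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix} \]

is a matrix of transition such that $P^{-1}AP$ is in Jordan normal form:

\[ P^{-1}AP = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix}. \]

So we consider the equivalent system

\[ Y' = (P^{-1}AP)Y, \]

where, as in the previous example, $Y = (y_1, y_2, y_3)$ and

\[ X = PY = y_1 E_{-1} + y_2 E_2 + y_3 E_2'. \]

This may be written as

\[ y_1' = -y_1 \]
\[ y_2' = 2y_2 + y_3 \]
\[ y_3' = 2y_3. \]

Solving this system, we find $y_1(t) = c_1 e^{-t}$, $y_3(t) = c_3 e^{2t}$ and $y_2$ satisfies

\[ y_2'(t) = 2y_2(t) + c_3 e^{2t}. \]

Using the integrating factor method (Unit 4) we find that

\[ y_2(t) = c_3 t e^{2t} + c_2 e^{2t}. \]

We now have that

\[ X(t) = PY(t) = y_1(t)E_{-1} + y_2(t)E_2 + y_3(t)E_2' \]
\[ = c_1 E_{-1} e^{-t} + c_2 E_2 e^{2t} + c_3 (E_2 t e^{2t} + E_2' e^{2t}) \]

is the general solution.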
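
The general solution of the example above can be checked mechanically. The sketch below is our own illustration (it is not taken from S or from this text's method); it assumes only the Python standard library, writes the derivative of each term by hand, including the product-rule term from $t e^{2t}$, and compares X'(t) with AX(t).

```python
import math

A = [[1, 1, 0],
     [-1, 3, 0],
     [-1, 4, -1]]
E_M1 = (0, 0, 1)    # eigenvector for the eigenvalue -1
E_2 = (1, 1, 1)     # eigenvector for the eigenvalue 2
E_2P = (-1, 0, 0)   # Jordan basis vector with A E_2' = 2 E_2' + E_2


def combine(terms):
    """Sum of (scalar, vector) pairs, component by component."""
    return [sum(s * v[i] for s, v in terms) for i in range(3)]


def solution(c1, c2, c3, t):
    """X(t) = c1 E_-1 e^-t + c2 E_2 e^2t + c3 (E_2 t e^2t + E_2' e^2t)."""
    em, e2 = math.exp(-t), math.exp(2 * t)
    return combine([(c1 * em, E_M1),
                    (c2 * e2 + c3 * t * e2, E_2),
                    (c3 * e2, E_2P)])


def derivative(c1, c2, c3, t):
    """X'(t); note d/dt (t e^2t) = (1 + 2t) e^2t."""
    em, e2 = math.exp(-t), math.exp(2 * t)
    return combine([(-c1 * em, E_M1),
                    (2 * c2 * e2 + c3 * (1 + 2 * t) * e2, E_2),
                    (2 * c3 * e2, E_2P)])


def residual(c1, c2, c3, t):
    """Largest absolute component of X'(t) - A X(t)."""
    x = solution(c1, c2, c3, t)
    ax = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
    return max(abs(d - a) for d, a in zip(derivative(c1, c2, c3, t), ax))
```

The extra term $E_2 t e^{2t}$ is exactly what compensates for the 1 above the diagonal in the Jordan block; dropping it from `solution` makes the residual non-zero whenever $c_3 \neq 0$.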


In general, when faced with a system X' = AX, we first find the Jordan
basis for A. If this is possible over the reals we have obtained a real matrix
P such that $J = P^{-1}AP$ is in Jordan normal form. It is then a simple matter
to solve Y' = JY, and having found its general solution, we know that
X = PY is the general solution of X' = AX.

Exercise 


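Find the general solution of X' = AX where

\[ A = \begin{bmatrix} 1 & 3 & -2 \\ 0 & 7 & -4 \\ 0 & 9 & -5 \end{bmatrix}. \]

(Hint: A Jordan normal form of A is

\[ \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

with matrix of transition

\[ \begin{bmatrix} 3 & 0 & 1 \\ 6 & 1 & 0 \\ 9 & 0 & 0 \end{bmatrix}. \]

(See Unit 10, sub-section 10.3.3).)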


Solution 
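We follow the procedure of this sub-section. We solve the auxiliary
system

\[ Y' = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} Y \]

that is:

\[ y_1' = y_1 + y_2 \]
\[ y_2' = y_2 \]
\[ y_3' = y_3. \]

The general solutions are

\[ y_3(t) = c_3 e^{t}, \quad y_2(t) = c_2 e^{t}, \quad y_1(t) = c_2 t e^{t} + c_1 e^{t}. \]

X' = AX has the general solution

\[ X = PY = (3, 6, 9)y_1 + (0, 1, 0)y_2 + (1, 0, 0)y_3 \]
\[ = c_1[(3, 6, 9)e^{t}] + c_2[(3, 6, 9)te^{t} + (0, 1, 0)e^{t}] + c_3[(1, 0, 0)e^{t}]. \]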


13.2.3 Complex Eigenvalues 


In our discussion of constant-coefficient systems of differential equations
X' = AX, we have so far not considered the case when A has insufficient
real eigenvalues. In Unit 10 we developed the Jordan normal form for
such an A by extending our field of scalars to the complex numbers.
We found that in this more general setting we could always find a matrix of
transition P which reduces A to Jordan normal form, except that now P
might have complex entries. We could of course still do this and consider
the system $Y' = (P^{-1}AP)Y$, but we would have to solve equations
$y_i' = \lambda y_i + y_{i+1}$ or $y_i' = \lambda y_i$, where $\lambda$ may be complex.


Again there are two cases, corresponding to the existence or otherwise of 
an eigenvector basis. In this sub-section we consider cases where the 
Jordan normal form contains complex elements but is diagonal. The more 
general case can be treated by an extension of these ideas. 


If A is an n x n matrix all of whose entries are real then Theorem 5-11 on
page S228 tells us that the system X' = AX must have as basis for its
solution space a set of n real n-tuples of functions, even though A has
complex eigenvalues. In fact we can find these n real-valued solution
functions for the case where A has a complex eigenvector basis. In the next
reading passage we see how these solutions can be found.


READ from line — 10, page S252, to line —3, page S256. 


Notes 


(i) The first part of this reading passage up to the end of Example 2 on page
S254 covers material which should not be new to you. The vector space $C^n$ was
introduced in Unit 1, and Unit 5 did not place any restriction on the field of
scalars; so the discussion in this complex case is a revision of Unit 5, Section 2.
The first new result is Lemma 5-4 on page S253 which relies specifically on the
fact that A has real entries although it represents an endomorphism of $C^n$.

(ii) Lemma 5-6, page S255 If at first sight the Lemma appears complicated,
read through Example 3 then return to the Lemma.

(iii) lines -8 and -7, page S256 The characteristic or auxiliary equation is
defined in Chapter 4 of K. You may recall that in a similar fashion the auxiliary
equation of a recurrence relation (Unit 7) is the characteristic equation of the
associated matrix (sub-section 5.3.2).


Exercises 
1. Exercise 1, page S257. 
2. Exercise 6(a), page S258. (The characteristic equation is $\lambda^3 + 6\lambda = 0$.)




Solutions 


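1. \[ \begin{vmatrix} -\lambda & a \\ -a & -\lambda \end{vmatrix} = \lambda^2 + a^2 = 0 \]

Therefore $\lambda = ia$ is one eigenvalue. Since

\[ \begin{bmatrix} -ia & a \\ -a & -ia \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

(1, i) is an eigenvector.
Using Lemma 5-6,

\[ G_{ia} = (1, 0), \quad H_{ia} = (0, -1). \]

Therefore

\[ X_1(t) = (1, 0) \cos at + (0, -1) \sin at = (\cos at, -\sin at) \]
\[ X_2(t) = (0, -1) \cos at - (1, 0) \sin at = (-\sin at, -\cos at) \]

The general solution is

\[ X(t) = \begin{bmatrix} \cos at & -\sin at \\ -\sin at & -\cos at \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \quad X(t) \in R^2. \]

Let $R^2$ be represented by a geometric plane and let X(t) be the
position vector of a particle. Then there exist $\alpha \in R^{+}$,
$\varepsilon \in (-\pi, \pi]$ such that

\[ c_1 = \alpha \sin \varepsilon \]
\[ c_2 = \alpha \cos \varepsilon \]

Therefore

\[ X(t) = (c_1 \cos at - c_2 \sin at,\ -c_1 \sin at - c_2 \cos at) \]
\[ = \alpha(\sin(\varepsilon - at),\ -\cos(\varepsilon - at)) \]

[Figure: the circular path of the particle in the x-y plane, showing the initial position of the particle and the position of the particle at time t.]

Thus the particle moves clockwise along a circular path with
angular velocity a radians/unit time.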


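2. We have the system X' = AX, where

\[ A = \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & 1 \\ 2 & -1 & 0 \end{bmatrix}. \]

We first calculate the eigenvalues for A. The characteristic equation
is $\lambda^3 + 6\lambda = 0$, so that the eigenvalues are 0, $+i\sqrt{6}$, $-i\sqrt{6}$.

We now calculate corresponding eigenvectors. Suitable
choices are

\[ E_0 = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \quad E_{i\sqrt{6}} = \begin{bmatrix} 1 + 2i\sqrt{6} \\ 2 - i\sqrt{6} \\ 5 \end{bmatrix}, \quad E_{-i\sqrt{6}} = \begin{bmatrix} 1 - 2i\sqrt{6} \\ 2 + i\sqrt{6} \\ 5 \end{bmatrix}. \]

Using the terminology of Lemma 5-6, we have

\[ G_{i\sqrt{6}} = \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \quad H_{i\sqrt{6}} = \begin{bmatrix} -2\sqrt{6} \\ \sqrt{6} \\ 0 \end{bmatrix} \]

and the general solution is

\[ c_1 \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + c_2 \left( \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} \cos \sqrt{6}\,t + \begin{bmatrix} -2\sqrt{6} \\ \sqrt{6} \\ 0 \end{bmatrix} \sin \sqrt{6}\,t \right) + c_3 \left( \begin{bmatrix} -2\sqrt{6} \\ \sqrt{6} \\ 0 \end{bmatrix} \cos \sqrt{6}\,t - \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} \sin \sqrt{6}\,t \right). \]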
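
The passage from a complex eigenvector to two real solutions (Lemma 5-6) can be sketched in a few lines of Python. This is an illustration of our own, assuming only the standard library; the names `x1`, `x2` and `residual` are hypothetical. It uses the matrix and eigenvector of Solution 2 above, takes G as the real part of the eigenvector and H as minus its imaginary part, and checks that both real functions satisfy X' = AX.

```python
import math

A = [[0, -1, -2],
     [1,  0,  1],
     [2, -1,  0]]
S6 = math.sqrt(6)
LAM = complex(0, S6)                                       # eigenvalue i*sqrt(6)
E = (complex(1, 2 * S6), complex(2, -S6), complex(5, 0))   # its eigenvector

ALPHA, BETA = LAM.real, LAM.imag
G = [z.real for z in E]     # G = (E + conj(E)) / 2
H = [-z.imag for z in E]    # H = i (E - conj(E)) / 2


def x1(t):
    """X1(t) = e^(alpha t) (G cos(beta t) + H sin(beta t))."""
    f, c, s = math.exp(ALPHA * t), math.cos(BETA * t), math.sin(BETA * t)
    return [f * (g * c + h * s) for g, h in zip(G, H)]


def x2(t):
    """X2(t) = e^(alpha t) (H cos(beta t) - G sin(beta t))."""
    f, c, s = math.exp(ALPHA * t), math.cos(BETA * t), math.sin(BETA * t)
    return [f * (h * c - g * s) for g, h in zip(G, H)]


def mat_vec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def eigen_residual():
    """Largest modulus among the components of A E - lambda E."""
    return max(abs(a - LAM * e) for a, e in zip(mat_vec(A, E), E))


def residual(t):
    """Check X1' = alpha X1 + beta X2 = A X1 and X2' = alpha X2 - beta X1 = A X2."""
    a, b = x1(t), x2(t)
    d1 = [ALPHA * p + BETA * q for p, q in zip(a, b)]
    d2 = [ALPHA * q - BETA * p for p, q in zip(a, b)]
    r1 = max(abs(d - v) for d, v in zip(d1, mat_vec(A, a)))
    r2 = max(abs(d - v) for d, v in zip(d2, mat_vec(A, b)))
    return max(r1, r2)
```

The derivatives used in `residual` follow from differentiating the two formulas directly; no numerical differentiation is involved.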


13.2.4 Summary of Section 13.2 


There were no new terms defined in this section. 


Theorems 


1. (Lemma 5-3, page S251)
For each real eigenvalue $\lambda$ of A and each eigenvector

\[ E_\lambda = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix} \]

belonging to $\lambda$, the function $X_\lambda = E_\lambda e^{\lambda t}$ is a solution of X' = AX. Moreover,
solutions formed in this way from distinct eigenvalues are linearly
independent in $V_n$.

2. (Corollary 5-3, page S251)

If A has n distinct real eigenvalues $\lambda_1, \ldots, \lambda_n$ and if $E_{\lambda_1}, \ldots, E_{\lambda_n}$ are
eigenvectors belonging to these eigenvalues, then the general solution of the
normal first-order system X' = AX is

\[ X(t) = c_1 E_{\lambda_1} e^{\lambda_1 t} + \cdots + c_n E_{\lambda_n} e^{\lambda_n t}, \]

where $c_1, \ldots, c_n$ are arbitrary constants.


3. (Theorem, page C14)
If the eigenvectors $E_1, \ldots, E_n$ of an n x n matrix A form a basis for $R^n$,
then the general solution of X' = AX is

\[ X(t) = c_1 E_1 e^{\lambda_1 t} + \cdots + c_n E_n e^{\lambda_n t}, \]

where $E_i$ is an eigenvector corresponding to $\lambda_i$.

4. (Lemma 5-4, page S253)
Let A be an n x n matrix with real entries, and suppose that $\lambda = \alpha + \beta i$ is
an eigenvalue for A. Then if Z is an eigenvector belonging to $\lambda$, $\bar{Z}$, the
complex conjugate of Z, is an eigenvector belonging to $\bar{\lambda} = \alpha - \beta i$.

5. (Lemma 5-5, page S254)
Let A be a real n x n matrix, and suppose that $E_\lambda$ is an eigenvector in $C^n$
belonging to the complex eigenvalue $\lambda = \alpha + \beta i$ of A. Then

\[ E_\lambda e^{\lambda t} \quad \text{and} \quad \bar{E}_\lambda e^{\bar{\lambda} t} \]

are solutions of the equation X' = AX.

6. (Lemma 5-6, page S255)
Let $\lambda = \alpha + \beta i$ be a complex eigenvalue for the n x n real matrix A, and
let $E_\lambda$ be an eigenvector in $C^n$ belonging to $\lambda$. Then the functions

\[ X_1(t) = e^{\alpha t}(G_\lambda \cos \beta t + H_\lambda \sin \beta t), \]
\[ X_2(t) = e^{\alpha t}(H_\lambda \cos \beta t - G_\lambda \sin \beta t), \]

where $G_\lambda = \dfrac{E_\lambda + \bar{E}_\lambda}{2}$ and $H_\lambda = \dfrac{i(E_\lambda - \bar{E}_\lambda)}{2}$,
are linearly independent solutions of X' = AX.


Technique 


Solve a normal system of equations X' = AX where A is a matrix of real constants.
The steps may be summarized in the following flow chart, given here as a list of steps.

1. Find the eigenvalues of A.
2. Calculate the eigenvectors.
3. If there is an eigenvector basis: write down the solutions (sub-section 13.2.1
for real eigenvalues, sub-section 13.2.3 for complex eigenvalues), then go to step 6.
4. If there is no eigenvector basis: calculate the Jordan normal form of A and the
matrix of transition P.
5. If the Jordan normal form has blocks of complex eigenvalues: STOP (not
developed in this unit). If all the 1's off the diagonal correspond to real
eigenvalues: calculate the solution.
6. STOP, or substitute the solution in X' = AX for checking.


Notation 


$E_\lambda$ (page S251)
$X_\lambda$ (page S251)
$E_\lambda^{(i)}$ (page C14)
$G_\lambda$ (page S255)
$H_\lambda$ (page S255)


13.3 APPLICATIONS 
13.3.0 Introduction 


In earlier units we have seen how differential equations describe physical 
situations involving mechanical or electrical vibrations. In practice more 
than one elementary system may be linked to form a complicated one. 
These may be described by a system of linear differential equations, and 
where the coefficients are constant we may be able to solve them by the 
methods of Section 2. In this section we consider various physical systems 
and the equations which describe their behaviour. 


We have seen (Unit 4, Differential Equations I) that a modelling situation 
involves three stages: 


(i) setting up the equation, 
(ii) solving the equation, 
(iii) interpreting the solution. 


In this section we follow through this programme for a mechanical system 
and we also set up equations for electrical networks. 


13.3.1 A Double Mass Spring System 


The first system we shall discuss consists of two masses joined by three 
springs. 


READ Example 1 of Section 5-12 from page S259 to line 10 of page S263.


Notes 


(i) Equations (5-51), page S259 The derivation of these equations has been
somewhat rushed. The most general situation has been described even though
the rest of the example only considers a special situation. Two physical laws have
been used. The first is Newton's law which relates the force on a particle to the
acceleration of the particle. The second is Hooke's law which says that when
a spring of stiffness k is extended by a distance x there is a (restoring)
force kx in the opposite direction. We have two equations, one for each particle.
In the first equation we have four forces acting on the mass. The first term comes
from the first spring, the second term from the second spring, the third term
from the viscous medium and the fourth term from an external applied force.

(ii) line -5, page S259 The simplifying assumption is that all three springs
are identical.

(iii) line 2, page S260 The equations (5-51) and (5-52) describe general spring
systems where the end "walls" described by $x_0$ and $x_3$ can also move (as in the
springing in a car). What S does here is to suppose these "walls" are fixed so
that $x_0$ and $x_3$ are constant.

(iv) line 11, page S260 "momentum variables". Here we are making a substitution,
introducing new functions $y_3$, $y_4$ defined by $y_3 = m_1\dot{x}_1$, $y_4 = m_2\dot{x}_2$. It
so happens that there is a physical interpretation of these functions as momentum,
but it is sufficient in this section to think of them as new functions. The object of
the substitution is to bring the system of equations to first order.

(v) line -9, page S261 To see that $(\mu_1 + \mu_2)^2 - 3\mu_1\mu_2 > 0$, note that $\mu_1 > 0$,
$\mu_2 > 0$ and $(\mu_1 + \mu_2)^2 - 3\mu_1\mu_2 = (\mu_1 - \mu_2)^2 + \mu_1\mu_2$, which is the sum of a
non-negative term and a positive term and so is positive. Also note that
$(\mu_1 + \mu_2)^2 > (\mu_1 + \mu_2)^2 - 3\mu_1\mu_2$,
so that in line -13 both values of $\lambda^2$ are negative (since k > 0).

(vi) lines -4 to -1, page S261 This set of equations is $AY = \lambda Y$.

(vii) line 2, page S263 The equations for $A_1$ and $A_2$ are:

\[ \tan A_2 = B_2/B_1, \quad A_1 = \sqrt{B_1^2 + B_2^2}. \]

This device was used in Unit 4, sub-section 4.4.2, and in Solution 1 of sub-section
13.2.3.


(viii) lines 5-6, page S263 $\omega$ and $\nu$ are the natural frequencies of the system. The
two solutions

\[ (y_1(t), y_2(t)) = \left( A_1 \cos(\omega t - A_2),\ \left(2 - \frac{\omega^2}{k\mu_1}\right) A_1 \cos(\omega t - A_2) \right) \]

and

\[ (y_1(t), y_2(t)) = \left( A_3 \cos(\nu t - A_4),\ \left(2 - \frac{\nu^2}{k\mu_1}\right) A_3 \cos(\nu t - A_4) \right) \]

are the normal modes of oscillation. This particular simple behaviour arises
because we set $d_1 = d_2 = 0$ and so obtained a quadratic in $\lambda^2$ instead of the
more general quartic in $\lambda$. Thus, it would in fact have been much faster to
write the system of equations (5-52) as follows:

\[ m_1\ddot{y}_1 = -2ky_1 + ky_2 \]
\[ m_2\ddot{y}_2 = ky_1 - 2ky_2 \]

where $y_1 = x_1 - \bar{x}_1$, $y_2 = x_2 - \bar{x}_2$.

Thus we have the normal second-order system

\[ D^2 \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} -\dfrac{2k}{m_1} & \dfrac{k}{m_1} \\ \dfrac{k}{m_2} & -\dfrac{2k}{m_2} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \]

which can be solved simply and directly using the methods of sub-section 13.2.2.
The reduction to first order is essential only when $d_1$ or $d_2$ is non-zero so that
there are some lower order derivatives.

(ix) lines 8 and 9, page S263 These frequencies can be obtained by putting
$\mu_1 = \mu_2 = \mu = \dfrac{1}{m}$ in the formula for $\lambda^2$ half-way down page S261.
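
The natural frequencies of the undamped system can be computed directly from the 2 x 2 second-order system in note (viii). The sketch below is our own and is not quoted from S: it assumes $\mu_i = 1/m_i$, uses the fact that the eigenvalues of the coefficient matrix are $k[-(\mu_1 + \mu_2) \pm \sqrt{(\mu_1 + \mu_2)^2 - 3\mu_1\mu_2}]$ (both negative, as in note (v)), and returns the square roots of their negatives. Which of the two S calls $\omega$ and which $\nu$ is not assumed here.

```python
import math


def natural_frequencies(k, m1, m2):
    """Return the two natural frequencies (slower first) of the two-mass,
    three-spring system with equal spring stiffness k and fixed end walls."""
    mu1, mu2 = 1.0 / m1, 1.0 / m2
    root = math.sqrt((mu1 + mu2) ** 2 - 3.0 * mu1 * mu2)
    s_slow = k * (-(mu1 + mu2) + root)   # eigenvalue of D^2 closer to zero
    s_fast = k * (-(mu1 + mu2) - root)   # eigenvalue of D^2 further from zero
    return math.sqrt(-s_slow), math.sqrt(-s_fast)
```

For equal masses $m_1 = m_2 = m$ the two values reduce to $\sqrt{k/m}$ and $\sqrt{3k/m}$, corresponding to the masses moving in phase and in opposition respectively.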


Exercises 


1. Evaluate $|A - \lambda I|$ for the matrix A on page S260 and hence obtain the
eigenvectors of A in terms of $\lambda$, for the case

\[ m_1 = m_2 = m > 0, \]
\[ d_1 = d_2 = d > 0. \]

2. Find the general solution of the system of equations (5-52) on page
S259 under the conditions

\[ m_1 = m_2 = m > 0 \]
\[ F_1(t) = F_2(t) = 0 \]
\[ d_1 = d_2 = d > 0 \]
\[ d^2 > 12km. \]


Solutions 


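1. With $\mu = \dfrac{1}{m}$, $|A - \lambda I|$ becomes in this case

\[ \begin{vmatrix} -\lambda & 0 & \mu & 0 \\ 0 & -\lambda & 0 & \mu \\ -2k & k & -(\lambda + d\mu) & 0 \\ k & -2k & 0 & -(\lambda + d\mu) \end{vmatrix} \]

To evaluate $|A - \lambda I|$ we reduce it to triangular form by row
operations. In the notation of Unit M100 26, Linear Algebra
III, let

\[ R_3 \to -\lambda R_3 + 2kR_1 - kR_2 \]
\[ R_4 \to -\lambda R_4 - kR_1 + 2kR_2 \]

Thus

\[ |A - \lambda I| = \lambda^{-2} \begin{vmatrix} -\lambda & 0 & \mu & 0 \\ 0 & -\lambda & 0 & \mu \\ 0 & 0 & \lambda(\lambda + d\mu) + 2k\mu & -k\mu \\ 0 & 0 & -k\mu & \lambda(\lambda + d\mu) + 2k\mu \end{vmatrix} \]

Next let

\[ R_4 \to R_4 + \frac{k\mu}{\lambda(\lambda + d\mu) + 2k\mu} R_3 \]

so that

\[ |A - \lambda I| = \lambda^{-2} \begin{vmatrix} -\lambda & 0 & \mu & 0 \\ 0 & -\lambda & 0 & \mu \\ 0 & 0 & \lambda(\lambda + d\mu) + 2k\mu & -k\mu \\ 0 & 0 & 0 & z \end{vmatrix} \]

where $z = \lambda(\lambda + d\mu) + 2k\mu - \dfrac{k^2\mu^2}{\lambda(\lambda + d\mu) + 2k\mu}$.

Hence $|A - \lambda I| = (\lambda(\lambda + d\mu) + 2k\mu)^2 - k^2\mu^2$.

From the above, the row reduced form of $A - \lambda I$ is

\[ \begin{bmatrix} -\lambda & 0 & \mu & 0 \\ 0 & -\lambda & 0 & \mu \\ 0 & 0 & \lambda(\lambda + d\mu) + 2k\mu & -k\mu \\ 0 & 0 & 0 & z \end{bmatrix} \]

Hence, since $(A - \lambda I)E_\lambda = 0$, we obtain

\[ E_\lambda = (k,\ \lambda^2 m + d\lambda + 2k,\ \lambda km,\ \lambda m(\lambda^2 m + d\lambda + 2k)). \]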
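2. Using Exercise 1,

\[ |A - \lambda I| = \left(\lambda^2 + \frac{d\lambda}{m} + \frac{k}{m}\right)\left(\lambda^2 + \frac{d\lambda}{m} + \frac{3k}{m}\right). \]

Hence

\[ \lambda_1, \lambda_2 = (-d \pm \sqrt{d^2 - 4km})/2m \]
\[ \lambda_3, \lambda_4 = (-d \pm \sqrt{d^2 - 12km})/2m \]

Thus, since $d^2 > 12km$, all the eigenvalues are real, negative
and distinct. Thus the problem falls into the class we discussed
in sub-section 13.2.1. Corresponding eigenvectors are

\[ E_1 = (k, k, \lambda_1 km, \lambda_1 km) \]
\[ E_2 = (k, k, \lambda_2 km, \lambda_2 km) \]
\[ E_3 = (k, -k, \lambda_3 km, -\lambda_3 km) \]
\[ E_4 = (k, -k, \lambda_4 km, -\lambda_4 km) \]

From page S260

\[ y_3 = m\dot{y}_1, \]
\[ y_4 = m\dot{y}_2, \]

hence we only require $y_1$ and $y_2$:

\[ y_1(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} + c_3 e^{\lambda_3 t} + c_4 e^{\lambda_4 t} \]
\[ y_2(t) = c_1 e^{\lambda_1 t} + c_2 e^{\lambda_2 t} - c_3 e^{\lambda_3 t} - c_4 e^{\lambda_4 t}. \]

(Note that we have absorbed a common factor of k into the
arbitrary constants.) Since each $\lambda_i$ is negative, $\lim_{t \text{ large}} y_1(t)$ and
$\lim_{t \text{ large}} y_2(t)$ are zero: i.e. in the long run damping brings the
system to rest.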
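
The eigenvalues and eigenvectors in Solution 2 can be checked for particular numbers. The following Python sketch is our own; the 4 x 4 matrix is the one read off from the determinant in Solution 1 (with $\mu = 1/m$), and the function names are hypothetical. It assumes only the standard library.

```python
import math


def system_matrix(m, k, d):
    """Matrix A of the first-order system for m1 = m2 = m, d1 = d2 = d."""
    mu = 1.0 / m
    return [[0, 0, mu, 0],
            [0, 0, 0, mu],
            [-2 * k, k, -d * mu, 0],
            [k, -2 * k, 0, -d * mu]]


def damped_eigenpairs(m, k, d):
    """(eigenvalue, eigenvector) pairs as given in Solution 2; needs d^2 > 12km."""
    if d * d <= 12 * k * m:
        raise ValueError("this sketch requires d^2 > 12km")
    pairs = []
    # sign = +1: masses in phase (k, k, ...); sign = -1: in opposition (k, -k, ...)
    for disc, sign in ((d * d - 4 * k * m, 1), (d * d - 12 * k * m, -1)):
        for pm in (1, -1):
            lam = (-d + pm * math.sqrt(disc)) / (2 * m)
            pairs.append((lam, [k, sign * k, lam * k * m, sign * lam * k * m]))
    return pairs


def eigen_residual(m, k, d):
    """Largest absolute component of A E - lambda E over all four pairs."""
    a = system_matrix(m, k, d)
    worst = 0.0
    for lam, e in damped_eigenpairs(m, k, d):
        ae = [sum(a[i][j] * e[j] for j in range(4)) for i in range(4)]
        worst = max(worst, max(abs(x - lam * y) for x, y in zip(ae, e)))
    return worst
```

With m = 1, k = 1, d = 4 (so that $d^2 = 16 > 12km$) the four eigenvalues are $-2 \pm \sqrt{3}$, $-1$ and $-3$, all negative and distinct, in agreement with the conclusion that damping brings the system to rest.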


13.3.2 Electrical Networks 


In this final sub-section we return to a physical situation that we have met 
before: electrical networks. In problems involving networks each possible 
circuit has a current and the situation is described by a system of differen- 
tial equations. We will not attempt to solve any equations in this sub- 
section but will concentrate on setting up such systems of equations. 


READ Example 2 on page S263 as far as line -6, the sentence in brackets.


Notes 


(i) line -13, page S263 "applied electromotive force" means applied voltage.


(ii) line -10, page S263 et seq. For Kirchhoff's laws, see page K170. To obtain
the differential equations we consider


(a) the R-L-L-E circuit, and

(b) the R-L-C circuit

and use the form of the voltage drops given on pages K170 and K172.

To reduce the system to normal form we first eliminate di/dt from the first
equation, obtaining


di, l 

LẸ =gh +O). 
ie. 

da alLa, Bw 

a EC? TL 

di Ri, 1l 

a ~ Lt re? 


We then note that i: =i, — i (applying Kirchhoff’s first law to either node of the 
network), so that 


ca 
a 


You should note that it is convenient to use the currents $i_k$ as the unknown
functions except for wires containing a condenser, in which case the corresponding
$i_k$ should be replaced by the charge $q_k$.

READ Example 3 on page S265.


The two circuits considered here are the R,-R-L-E circuit and the R,-C 
circuit. In normal form the equations may be written 


di, R, Ri]. 1 
sii ee = E(t 
di ET Ey zœ 
= + 

dis 1 1 

23 a lh 0 

dt RC R,C 
In this form the system is incomplete: there is also the additional equation 
a =i -i 
dt 1 3 




Exercise 


Consider the following network and write down a normal system of 
equations which describes its behaviour. 


[Figure: circuit diagram of the network, driven by the applied voltage E(t); not reproduced.]


Solution 


For the E-L-C, circuit we have 


G2 di, aq, 

koa — = E(f =—. 

Cm FO. be 
For the C,-R-E circuit we have 

q dq 

Tt R= EO, kha 


Because we have a condenser and an inductance in the same wire, 
it is necessary to use both i, and q, as unknowns. We use q; as the 
third unknown. The system is 


diy q2 l 
a ARG 
dt IG TL o 
a” 

dq; q3 1 
at RG R O 


nee ie A P . , 
In addition i, = T — i}, but this equation does not appear in 


. di 
the system since a does not appear. 


13.3.3 Summary of Section 13.3 
In this section no new theory was developed. We modelled a mass-spring 


system and electrical networks, expressing the systems of differential 
equations in normal form. 




13.4 Summary of the Unit 


In this unit we developed a way of solving directly a system of differential 
equations. 


In the first section we discussed the structure of a system of differential 
equations and what is meant by the normal form of such a system. We 
restricted ourselves to the first-order case for there are many higher order 
systems which can be made equivalent to first order systems. The last two 
sub-sections considered the existence of solutions to the system and the 
nature of the resulting solution space. The second section developed a 
method for solving constant-coefficient homogeneous systems. The steps 
are summarized in the form of a flow chart at the end of that section. The 
last section then applied the theory learnt to two particular physical 
systems: a double mass spring system and electric circuits. 


Definitions 
system of linear differential equations (page S224) 
homogeneous system (page S224) 
nonhomogeneous system (page $224) 
operator matrix (page $224) 
normal first-order system (page $225) 
Wronskian (page C10) 

Theorems 

1. (Theorem 5-11, page S228)

Let

\[ X' = A(t)X + B(t) \]

be a normal n x n first-order system of linear differential equations
defined on an interval I. Then if $t_0$ is any point in I and $X_0$ is any vector in
$R^n$, the given system has a unique solution X = X(t) such that $X(t_0) = X_0$.

2. (Initial Condition Isomorphism Theorem, page C8)

Let W be the solution set of $X'(t) = A(t)X(t) + B(t)$, $t \in I$, and let
$E : W \to R^n$ be specified by $E(X) = X(t_0)$, $t_0 \in I$. Then E is one-to-one
and onto.

3. (Theorem 5-12, page S229)

The dimension of the solution space W of any homogeneous n x n system
$X' = A(t)X$ is n, the number of equations in the system.


4. (Lemma 5-3, page S251)
For each real eigenvalue $\lambda$ of A and each eigenvector

\[ E_\lambda = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix} \]

belonging to $\lambda$, the function $X_\lambda = E_\lambda e^{\lambda t}$ is a solution of X' = AX. Moreover,
solutions formed in this way from distinct eigenvalues are linearly
independent in $V_n$.

5. (Corollary 5-3, page S251)

If A has n distinct real eigenvalues $\lambda_1, \ldots, \lambda_n$ and if $E_{\lambda_1}, \ldots, E_{\lambda_n}$ are
eigenvectors belonging to these eigenvalues, then the general solution of the
normal first-order system X' = AX is

\[ X(t) = c_1 E_{\lambda_1} e^{\lambda_1 t} + \cdots + c_n E_{\lambda_n} e^{\lambda_n t} \]

where $c_1, \ldots, c_n$ are arbitrary constants.

6. (Theorem, page C14)
If the eigenvectors $E_1, \ldots, E_n$ of an n x n matrix A form a basis for $R^n$,
then the general solution of X' = AX is

\[ X(t) = c_1 E_1 e^{\lambda_1 t} + \cdots + c_n E_n e^{\lambda_n t}, \]

where $E_i$ is an eigenvector corresponding to $\lambda_i$.



7. (Lemma 5-4, page S253)

Let A be an n x n matrix with real entries, and suppose that $\lambda = \alpha + \beta i$ is
an eigenvalue for A. Then if Z is an eigenvector belonging to $\lambda$, $\bar{Z}$, the
complex conjugate of Z, is an eigenvector belonging to $\bar{\lambda} = \alpha - \beta i$.

8. (Lemma 5-5, page S254)
Let A be a real n x n matrix, and suppose that $E_\lambda$ is an eigenvector in $C^n$
belonging to the complex eigenvalue $\lambda = \alpha + \beta i$ of A. Then

\[ E_\lambda e^{\lambda t} \quad \text{and} \quad \bar{E}_\lambda e^{\bar{\lambda} t} \]

are solutions of the equation X' = AX.

9. (Lemma 5-6, page S255)
Let $\lambda = \alpha + \beta i$ be a complex eigenvalue for the n x n real matrix A, and
let $E_\lambda$ be an eigenvector in $C^n$ belonging to $\lambda$. Then the functions

\[ X_1(t) = e^{\alpha t}(G_\lambda \cos \beta t + H_\lambda \sin \beta t) \]
\[ X_2(t) = e^{\alpha t}(H_\lambda \cos \beta t - G_\lambda \sin \beta t) \]

where $G_\lambda = \dfrac{E_\lambda + \bar{E}_\lambda}{2}$ and $H_\lambda = \dfrac{i(E_\lambda - \bar{E}_\lambda)}{2}$,
are linearly independent solutions of X' = AX.


Techniques 


1. Given a system of first-order linear differential equations, put it into 
normal form. 


2. Solve a normal system of equations X’ = AX where A is a matrix of 
real constants. (See the summary of Section 13.2.) 


3. Derive a system of equations for a mass-spring system, and an electrical 
network 


Notation 

x (page S224) 
VU, (page S225) 
E (page C8) 
E, (page C10) 
$W[X_1, \ldots, X_n]$ (page C10)
$E_\lambda$ (page S251)
$X_\lambda$ (page S251)
$E_\lambda^{(i)}$ (page C14)
$G_\lambda$ (page S255)
$H_\lambda$ (page S255)




13.5 SELF-ASSESSMENT 
Self-assessment Test 


This Self-assessment Test is designed to help you test quickly your under- 
standing of the unit. It can also be used, together with the summary of the 
unit for revision. The answers to these questions will be found on the next 


non-facing page. We suggest you complete the whole test before looking at 
the answers. 


1. Write the differential equation

\[ D^4y(x) + xD^2y(x) + 4y(x) = \cos x \]

as a normal first-order system of equations in matrix form.


2. Reduce the following system of equations to a normal first-order
system in matrix form

\[ 4Dy_1(x) + 6y_1(x) + Dy_2(x) = e^{x} \]
\[ 2y_2(x) + Dy_1(x) - 3Dy_2(x) = e^{-x} \]

3. Solve the system of equations X' = AX, where

\[ A = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix} \]

with initial condition X(0) = (0, 1).


4. Consider the following network and find a normal first-order system 
of equations which describe its behaviour. 


[Figure: circuit diagram of the network; not reproduced.]


Solutions to Self-assessment Test 


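1. With $Y = (y, Dy, D^2y, D^3y)$,

\[ DY(x) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -4 & 0 & -x & 0 \end{bmatrix} Y(x) + \begin{bmatrix} 0 \\ 0 \\ 0 \\ \cos x \end{bmatrix} \]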


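2. First treat $Dy_1(x)$ and $Dy_2(x)$ as algebraic unknowns.

\[ \begin{bmatrix} 4 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} Dy_1(x) \\ Dy_2(x) \end{bmatrix} = \begin{bmatrix} -6y_1(x) + e^{x} \\ -2y_2(x) + e^{-x} \end{bmatrix} \]

\[ R_1 \to 3R_1 + R_2 \]
\[ R_2 \to R_2 - \tfrac{1}{13}R_1 \]

\[ \begin{bmatrix} 13 & 0 \\ 0 & -3 \end{bmatrix} \begin{bmatrix} Dy_1(x) \\ Dy_2(x) \end{bmatrix} = \begin{bmatrix} -18y_1(x) - 2y_2(x) + 3e^{x} + e^{-x} \\ \tfrac{18}{13}y_1(x) - \tfrac{24}{13}y_2(x) - \tfrac{3}{13}e^{x} + \tfrac{12}{13}e^{-x} \end{bmatrix} \]

Rearranging

\[ \begin{bmatrix} Dy_1(x) \\ Dy_2(x) \end{bmatrix} = \frac{1}{13} \begin{bmatrix} -18 & -2 \\ -6 & 8 \end{bmatrix} \begin{bmatrix} y_1(x) \\ y_2(x) \end{bmatrix} + \frac{1}{13} \begin{bmatrix} 3e^{x} + e^{-x} \\ e^{x} - 4e^{-x} \end{bmatrix} \]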
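3. \[ \begin{vmatrix} 1 - \lambda & 1 \\ 2 & 3 - \lambda \end{vmatrix} = \lambda^2 - 4\lambda + 3 - 2 = \lambda^2 - 4\lambda + 1 = 0 \]

i.e. $\lambda = 2 \pm \sqrt{4 - 1} = 2 \pm \sqrt{3}$.

Eigenvectors are

\[ E_\lambda = \begin{bmatrix} -1 \\ 1 - \lambda \end{bmatrix}: \]

\[ E_{2+\sqrt{3}} = (-1, -1 - \sqrt{3}) \]
\[ E_{2-\sqrt{3}} = (-1, -1 + \sqrt{3}) \]

Therefore the general solution is

\[ c_1(-1, -1 - \sqrt{3})e^{(2+\sqrt{3})t} + c_2(-1, -1 + \sqrt{3})e^{(2-\sqrt{3})t} \]

Since X(0) = (0, 1), we have

\[ c_1 + c_2 = 0 \]
\[ -c_1 - \sqrt{3}c_1 - c_2 + \sqrt{3}c_2 = 1 \]

so that

\[ c_1 = -\frac{1}{2\sqrt{3}}, \quad c_2 = \frac{1}{2\sqrt{3}}. \]

Solution is:

\[ X(t) = \frac{1}{2\sqrt{3}} \left[ (1, 1 + \sqrt{3})e^{(2+\sqrt{3})t} + (-1, -1 + \sqrt{3})e^{(2-\sqrt{3})t} \right] \]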
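
A quick numerical check of Solution 3 is sketched below. It is our own illustration, assuming only the Python standard library, and the function names are hypothetical: it evaluates the stated solution, confirms the initial condition X(0) = (0, 1), and compares the exact derivative with AX(t).

```python
import math

A = [[1, 1],
     [2, 3]]
R3 = math.sqrt(3)


def components(p, q):
    """(1/(2 sqrt 3)) [ (1, 1 + sqrt 3) p + (-1, -1 + sqrt 3) q ]."""
    return [(p - q) / (2 * R3),
            ((1 + R3) * p + (-1 + R3) * q) / (2 * R3)]


def x(t):
    """X(t) with p = e^((2 + sqrt 3) t) and q = e^((2 - sqrt 3) t)."""
    return components(math.exp((2 + R3) * t), math.exp((2 - R3) * t))


def x_prime(t):
    """X'(t): each exponential is multiplied by its own exponent."""
    return components((2 + R3) * math.exp((2 + R3) * t),
                      (2 - R3) * math.exp((2 - R3) * t))


def residual(t):
    """Largest absolute component of X'(t) - A X(t)."""
    xt = x(t)
    ax = [A[0][0] * xt[0] + A[0][1] * xt[1],
          A[1][0] * xt[0] + A[1][1] * xt[1]]
    return max(abs(d - a) for d, a in zip(x_prime(t), ax))
```

Evaluating `x(0.0)` reproduces the initial condition up to rounding, and `residual` stays at rounding level for moderate values of t.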


4. Use (i) C-L, circuit 
Gi) L,-E-R, circuit 
(iii) L3-R,-£ circuit 




di 
(i) 2,58 +R = 0 
ae 
G) L, T +i; R, = E(t) 


By Kirchhoff’s first law: i, = i, — i = iy — (i — ig) = ig 


In normal form we have 


M, 

dt 
dy =u 
dt T L,C 


Other solutions are possible. 




LINEAR MATHEMATICS


1 Vector Spaces
2 Linear Transformations
3 Hermite Normal Form
4 Differential Equations I
5 Determinants and Eigenvalues
6 NO TEXT
7 Introduction to Numerical Mathematics: Recurrence Relations
8 Numerical Solution of Simultaneous Algebraic Equations
9 Differential Equations II: Homogeneous Equations
10 Jordan Normal Form
11 Differential Equations III: Nonhomogeneous Equations
12 Linear Functionals and Duality
13 Systems of Differential Equations
14 Bilinear and Quadratic Forms
15 Affine Geometry and Convex Cones
16 Euclidean Spaces I: Inner Products
17 NO TEXT
18 Linear Programming
19 Least-squares Approximation
20 Euclidean Spaces II: Convergence and Bases
21 Numerical Solution of Differential Equations
22 Fourier Series
23 The Wave Equation
24 Orthogonal and Symmetric Transformations
25 Boundary-value Problems
26 NO TEXT
27 Chebyshev Approximation
28 Theory of Games
29 Laplace Transforms
30 Numerical Solution of Eigenvalue Problems
31 Fourier Transforms
32 The Heat Conduction Equation
33 Existence and Uniqueness Theorem for Differential Equations
34 NO TEXT